A PROGNOSTIC TEST FOR STUDENTS IN DESIGN


J. P. GUILFORD AND RUTH B. GUILFORD 
University of Nebraska


Among the many tests of special talent only a few have been 
designed to gauge that most intangible trait of ‘‘artistic abil-
ity.’’ The tests that have been constructed for this capacity
are tests of appreciation rather than of performance or pro-
duction.¹ We are willing to concede that an ability to appre-
ciate good forms in art is a minimum essential to the artist.
But nowhere has it been demonstrated that such an ability is 
a sufficient condition for producing creative work in art. 
Common sense would tend to show that there are thousands 
of people who are appreciative of artistic production whereas 
only the few are really creative. Tests of appreciation, there- 
fore, measure some of the necessary conditions but not the 
sufficient conditions for artistic production. Having given 
such tests, one can tell who will not succeed as artists, perhaps, 
but high scores in the tests will not guarantee success in art. 

It is the contention of the writers that a real test of creative 
ability in art should call upon the subject to produce some- 
thing as an expression of that ability. In this study we have 
limited ourselves to the pictorial arts, in particular the field 
of design. The line-drawing test which will be described is 
only one of several that were used in an attempt to establish 
a battery of tests for students in design. The results from 
this test alone were so satisfactory that we feel justified in 
making this preliminary report. Only two other measure-
ments in the proposed battery give any promise of prognostic
value. What they will add to the line-drawing test in pre-
dictive value is still to be determined.

1 For recent examples of tests of artistic talent see A. S. Lewerenz,
Scientific measurement in the realm of art, 1927 Yearbook of the South-
western Educ. Res. & Guid. Assoc.; N. C. Meier, A measure of art tal-
ent, Psychol. Monog., 1928, 39, 184-199.


In our ‘‘job-analysis’’ of the work of the designer we kept 
in mind the fact that he must express himself through the 
medium of rather simple and somewhat conventionalized lines. 
Our test therefore attempted to measure the ability to express 
certain feelings or attributes through the use of simple lines. 
The reader may recall the previous studies of Lundholm² and
of Poffenberger and Barrows³ on the feeling value of lines.
The former study found that the average individual when
given an adjective, e.g., ‘‘furious,’’ or ‘‘lively,’’ could draw
some simple line, in the form of a wave, curve, or angular
pattern, to represent the feeling implied. The latter found
that when such lines are shown to a large number of subjects
who are asked to name the feeling expressed, there is a sur-
prising agreement upon the feeling named. Neither study
made a point of individual differences in the ability to express
or to appreciate feelings in lines. That is our problem.

The first task in constructing the test was to select a limited
number of suitable, representative adjectives. Lundholm had
used 48 adjectives and found that these could be classified into 
13 groups according to meaning and according to expression 
in lines. The writers decided that a representative list could 
be made by choosing one or two adjectives from each class. 
Lundholm’s list of 48 adjectives was given to a group of 30 
subjects who were asked to draw the most expressive line for 
each word. Those adjectives were chosen for the test which 
had the most agreement in the kind of lines used to express 
them. It might seem that this would result in the selection 
of the easiest adjectives for the test. This may be true, but 
we believe that there is still sufficient disagreement even on 
the easiest ones to make them of diagnostic value. It was
necessary to use adjectives that gave sufficient agreement in
order to find any norms for scoring the results of the test.
Then, too, the easiest 24 out of 48 were not chosen, but the
easiest one or two out of each class. Some classes are undoubt-
edly easier than others, and the number of words in each class
varied from one to five. The writers feel, therefore, that the
24 which were chosen are representative, that they are easy
enough to give us reliable norms, and still difficult enough to
be diagnostic. No study of the diagnostic value of each indi-
vidual adjective has yet been made.

2 H. Lundholm, The affective tone of lines, Psychol. Rev., 1921, 28,
43-60.

3 A. T. Poffenberger and B. E. Barrows, The feeling value of lines,
J. Appl. Psychol., 1924, 8, 187-205.

The amount of agreement in the responses of the 30 sub- 
jects was determined by comparing three aspects of the lines, 
the form, direction, and type. The lines can be described as to
form in four ways: wave, curve, angle, and straight line. Three
directions were distinguished: up (upward slant), down (down-
ward slant), and horizontal. All vertical lines were classified as
horizontal. They occurred very rarely. Three degrees of heavi-
ness or shading constituted the variations in type: heavy, medium,
and light. The form and direction of the lines had been
found to be significant by the previous investigators. We
found the type or shading to be even more significant, at least
as an index of individual differences. This classification gives ten
categories upon which to grade each line in the test, i.e., four for
form, three for direction, and three for type. 

The 24 adjectives which were finally selected were: sad, 
forceful, dead, earnest, playful, tranquil, lively, cruel, joyous, 
quiet, grave, gentle, lazy, fiery, furious, jolly, hard, agitating, 
angry, faint, delicate, idle, sorrowful, strong. 

It was impossible to establish the norms for form, direction 
and type from the results of only 30 subjects. The test was 
repeated with 144 new subjects, using the 24 adjectives. Of 
these 144 subjects, 55 were students of design. From these 
results the frequencies with which each line fell into each of 
the ten categories listed above were tabulated. For example, 
the adjective ‘‘idle’’ was represented by an angle 9 times, 
curve 57, wave 28, straight line 50. As to direction, these
lines were horizontal 110 times, up 9, down 25. As to type,
they were heavy 4, medium 56, light 84. The adjective
‘‘angry’’ gave the following results: angle 59, curve 20, wave
26, straight line 39, horizontal 77, up 50, down 17, heavy 107, 
medium 30, light 7. The relative frequencies might be 
changed with the addition of the results of more subjects, but 
the writers are inclined to think that there would be few 
significant alterations. At any rate, this gave a preliminary 
basis for developing a method of scoring the test. 

The problem next arose as to the weighting of these fre- 
quencies. Should we take for each line simply the most 
frequent form, direction and type and score each one point? 
Obviously, the modal categories have different frequencies 
and sometimes there are two categories under form, direction 
and type with about the same frequencies. For example, 
‘‘jolly’’ has 60 curves and 62 waves; ‘‘lively’’ has 56 medium
and 51 light lines. Again, some lines are so rarely found 
under some categories that it would seem proper to give them 
a negative score when they do appear, e.g., ‘‘playful’’ is only 
once represented by a straight line; ‘‘gentle’’ is only once 
given an angle; ‘‘lively’’ is only once given a heavy line. 

The following scheme of weighting was finally devised. It 
may be indefensible statistically, but it gives results. The 24 
frequencies under each of the ten categories were assumed to 
obey the law of normal distribution. This was not strictly 
true, for in almost every case the distributions were skewed 
toward the zero end of the scale. We proceeded to find the 
standard deviations of the ten distributions, however, and 
also the means. Each frequency was then translated into a 
sigma value above or below the mean. For example, the 
average frequency for curves was 44 with a standard devia- 
tion of 24.5. The adjective ‘‘lazy,’’ which was expressed 85 
times as a curve, was accordingly given a value of plus 2, 
because 85 is about 2 sigma above the mean. The adjective 
‘‘forceful,’’ which had a frequency of only 8 for curves,
would be given a value of minus 2. In the end, all sigma
values were made positive by a shift in the zero point, by add- 
ing two points to every value. Thus, each line could be 
scored in each of the ten categories on a scale which ran from 
zero to four inclusive. 
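
The conversion just described can be sketched in a few lines of Python. The paper does not say how fractional sigma values were rounded. The sketch assumes, purely as a guess, that the magnitude was rounded up to the next whole sigma, since that is one rule which reproduces both worked examples in the text (85 curves for ‘‘lazy’’ giving plus 2, and 8 curves for ‘‘forceful’’ giving minus 2). The function name `sigma_weight` and the clipping to the range minus 2 to plus 2 are likewise assumptions, not the writers' stated procedure.

```python
import math

def sigma_weight(frequency, mean, sd):
    """Convert a category frequency into a 0-4 scoring weight.

    Assumption (not stated in the paper): the sigma value is rounded
    away from zero to a whole number and clipped to the range -2..+2
    before the zero point is shifted by adding 2.
    """
    z = (frequency - mean) / sd                   # sigma value above or below the mean
    whole = math.copysign(math.ceil(abs(z)), z)   # round the magnitude upward
    clipped = max(-2, min(2, int(whole)))         # keep within -2..+2
    return clipped + 2                            # shift the zero point: scale 0..4

# Figures quoted in the text for the "curve" category: mean 44, S.D. 24.5
lazy = sigma_weight(85, 44, 24.5)      # 85 curves, about 1.67 sigma above the mean
forceful = sigma_weight(8, 44, 24.5)   # 8 curves, about 1.47 sigma below the mean
print(lazy, forceful)
```

Under this assumed rule a curve drawn for ‘‘lazy’’ would earn 4 points for form and a curve drawn for ‘‘forceful’’ would earn none. A different rounding rule would change the intermediate weights.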

The method of scoring is somewhat unwieldy, but the writ- 
ers have found no way of shortening it without seriously 
impairing the reliability and validity of the test. The total 
number of points possible in the test is 200. The actual 
range which we have found is from 122 to 180, or 58 points, 
for the 213 subjects whose tests have been scored by this 
method. 

The test is given to the subjects with the following printed 
instructions: ‘‘In the space below you will find a list of 
adjectives. Take each adjective in turn. You are to think 
of its meaning and then to draw a single line which best ex- 
presses the meaning that it conveys to you. Your lines will 
be graded upon the general form, the direction and the heavi- 
ness of them. You will have ten minutes in which to com- 
plete the test. Begin.’’ 

The instructions were worded, after several preliminary 
trials, so as to produce sufficient uniformity that the method 
of scoring could be used. More freedom allows the best sub- 
jects to show their superiority to better advantage but makes 
scoring impossible. The results show that there is sufficient 
freedom within the limits imposed by the instructions to 
detect the superior as well as the inferior. Ten minutes is 
ample time for all but the very slowest who are given extra 
time to do all the lines. It is not a speed test. We found 
that students of design have formed habits of working slowly 
and carefully and should not be hurried. All subjects were 
required to use lead pencils. 

The reliability and validity of the test have been investi-
gated with two groups of students in a course in design.⁴
One group was composed of 45 students who were tested in
May, 1929. The other was made up of 50 students who were
tested in October, 1929. We shall refer to the former as the
A-group and to the latter as the B-group. It is unfortunate
that the subjects in both groups had had such varied degrees
of training in art, in the course in design as well as in other
courses. The B-group contained more untrained subjects,
since 23 of them signified that they had had no previous train-
ing in art. The A-group had had at least one semester of
work in design. It is interesting to find that the B-group
actually had a higher average in the line-drawing test than
the A-group in spite of relatively less training in art. The
difference was only 3.5 points, but this is about twice the
standard error of the difference. This may mean that train-
ing in art does not improve the ability to do the line-drawing
test, or it may mean that the B-group had by chance much
greater ability and did better in spite of a handicap in train-
ing. We cannot be sure of the cause of the difference, of
course, without knowing more about the composition of the
two groups. The difference is sufficient, however, to make it
necessary to treat the two groups separately in the correla-
tions which we are now to discuss. The coefficients of corre-
lation are less reliable because of the smaller number of sub-
jects in the group, but we have two sets of results instead of
one, and the one set fully confirms the other.

4 We are greatly indebted to Miss E. J. Noh, assistant in psychology,
who administered the tests, and to Miss Louise E. Mundy of the School
of Fine Arts who generously granted the use of her classes and who
furnished the ratings which gave us the criterion for creative ability in
our subjects.

Two criteria were used in determining the validity of the
test, the teacher’s ratings and the school marks in the course
in design. The teacher of the course came into daily contact
with the work of each individual student and had ample
opportunity to form a judgment of his creative ability. The
ratings were made after at least one semester’s acquaintance
with the student. The instructor first rated each student on
a five point scale, dividing each class into five sub-groups.
Her conception of creative ability included two aspects: (1) 
the originality and (2) the fertility of the students’ ideas in 
making designs. The A-group was rated twice, with an in- 
terval of a month between ratings, and the coefficient of reli- 
ability of the combination of these two ratings was found to 
be .82 ± .04. On the same basis, we may estimate that the
reliability of the ratings of the B-group, which was rated only
once, would be .70 ± .06. The instructor was then given the
names of the students in each sub-group and was asked to put
them in rank order. We then assigned scores to the subjects
in both the A-group and the B-group according to Hull’s
method.⁵

Two methods were used to determine the reliability of the 
test. The first and last halves were correlated and also the 
odds and evens. The coefficients of reliability, after using 
the Spearman-Brown formula, are to be seen in Table 1. 
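
As a rough illustration of the split-half procedure named above, the following Python sketch correlates odd and even items and steps the result up with the Spearman-Brown formula. The helper names and the data layout (one list of per-line scores for each subject) are assumptions made for the example and are not taken from the paper.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r, k=2.0):
    """Reliability of a measure k times as long as the one that gave r."""
    return k * r / (1 + (k - 1) * r)

def odd_even_reliability(item_scores):
    """item_scores: one list of per-line scores for each subject (assumed layout)."""
    odd = [sum(s[0::2]) for s in item_scores]    # items 1, 3, 5, ...
    even = [sum(s[1::2]) for s in item_scores]   # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))

# The same formula run in reverse (k = 0.5) steps the reliability of two
# combined ratings (.82) down to that of a single rating.
single_rating = spearman_brown(0.82, k=0.5)
print(f"{single_rating:.3f}")
```

The stepped-down figure comes out a little under .70, close to the value the writers estimate for a group rated only once. Whether they used exactly this computation is not stated.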

















TABLE 1
Coefficients of Reliability of Line-Drawing Test Scores and of the
Criterion

         LINE-DRAWING TEST SCORES
GROUP    1st and 2nd halves    Odds and evens    TEACHER’S RATING
A        .56 ± .08             .66 ± .07         (.70 ± .06)*
B        .72 ± .06             .7? ± .08         .82 ± .04

* Estimated

The reliability is not as high as one would wish, but perhaps 
it cannot be much higher in a test of this sort. Had the test 
items been arranged in multiple-choice form, or some other 
form requiring judgments rather than creative responses, the 
reliability might have been higher. In a productive test, the 
subjects have so much latitude that it is difficult to score the 
results so as to give reliable measurements. Something de-
pends upon the judgment of the scorer. Perhaps, also, indi-
viduals vary more in their productive level than they do in
their level of judgment, and so productive tests can never be
as reliable as tests of judgment or appreciation. We find
support for this in the fact that artistic productions of the
same individual are likely to be of uneven quality, some good,
and some poor.

5 C. L. Hull, Aptitude Testing, 1928, 386-390.

When we correlate the test scores with the teacher’s ratings,
the coefficients are .65 and .58. It is a remarkable fact that
these coefficients approximate the coefficients of reliability of
the test itself. It must be remembered that there are large
errors of attenuation in both the test scores and in the cri-
terion. If we correct for these errors by the method of Dun-
lap and Cureton,⁶ the coefficients are .84 ± .08 and .83 ± .08.
These coefficients represent the true relation between the thing
measured by the test and the thing which the teacher of design
means by creative ability. This is an unusually high validity
to be found for any test of special capacity. The predictive
value of the test falls short of this degree, however, because
of the low reliability of the test itself. Correction for attenua-
tion in the test scores merely glosses over deficiencies that are
really there and must be dealt with in practice.⁷ The coeffi-
cients which give the correlation corrected for attenuation in
the criterion alone are about .70 (see Table 2).
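
The correction for attenuation in the criterion alone can be checked with a short Python sketch. It uses the classical Spearman form of the correction, which is an assumption here: the Dunlap and Cureton procedure cited by the writers may differ in detail. Each raw coefficient is paired with the rating reliability printed in the same row of Table 1. The function names are illustrative only.

```python
import math

def correct_one_variable(r_xy, r_yy):
    """Correlation corrected for unreliability in the criterion only."""
    return r_xy / math.sqrt(r_yy)

def correct_both_variables(r_xy, r_xx, r_yy):
    """Classical Spearman correction for unreliability in both measures."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Raw test-rating correlations and rating reliabilities as printed in the tables
print(round(correct_one_variable(0.65, 0.82), 2))   # 0.72
print(round(correct_one_variable(0.58, 0.70), 2))   # 0.69
```

Both results agree with the ‘‘corrected in criterion’’ column of Table 2. The two-variable function is included only to show the general form.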
TABLE 2
Coefficients of Correlation of Line-Drawing Test Scores with:

                       RATINGS,        RATINGS,          SCHOOL
GROUP    TEACHER’S     CORRECTED IN    CORRECTED IN      GRADES
         RATINGS       CRITERION       BOTH VARIABLES    IN DESIGN
A        .58 ± .06     .69 ± .07       (.83 ± .08)*      .48 ± .07
B        .65 ± .06     .72 ± .06       .84 ± .08         .58 ± .06

* Estimated


6 E. E. Cureton and J. W. Dunlap, Spearman’s correction for at-
tenuation and its probable error, Amer. J. Psychol., 1930, 42, 235-245.
7 J. W. Dunlap and E. E. Cureton, The correlation corrected for
attenuation in one variable and its standard error, Amer. J. Psychol.,
1930, 42, 405-407.
It is highly gratifying to note that this degree of correlation
is fully as high as that obtained by a self-correlation of the 
teacher’s ratings. In other words, the test tells us as much 
about the creative ability of the students as the teacher can 
in a single rating after a full semester’s acquaintance with 
their work. 

A certain refinement which might have raised the validity 
of the test was attempted. This improvement consisted of a 
differential weighting of the scores for form, direction and 
type of line. The method of scoring which we used weights 
these three aspects of the lines about equally. The total num- 
ber of points that can be earned in the three aspects are 69, 
65 and 66 and the means actually obtained in the three are 
52, 48, and 56 respectively. We did not know that the three 
aspects were equally significant. The problem, then, was to 
find the relative weights to be assigned to each of the three 
groups of categories, using the teacher’s ratings as the 
criterion. 

The test results of the B-group were scored for form, direc- 
tion and type separately. Each set of scores was correlated 
with the criterion. The raw coefficients were .33, .40 and .49 
respectively. From the raw coefficients it would seem that 
the type of line should carry the most weight and the form 
least. Partial coefficients of the second order were then 
found, correlating each of the three aspects with the criterion 
in turn, holding the other two constant. These coefficients 
were .25, .33 and .38 respectively. Again, the type seems to 
be slightly more significant than the others and the form 
slightly less significant. But the real test of their relative 
significance lies in the regression coefficients. These are .085, 
.082 and .112 respectively. From this we conclude that form 
and direction scores should be weighted equally, but that the 
scores for type should be given about a third more weight. 
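
For readers unfamiliar with partial coefficients of the second order, the following Python sketch shows the standard recursion by which they are built from zero-order correlations. The correlations of form, direction and type with the criterion are those reported above. The intercorrelations among the three aspects are not given in the paper, so the value of .20 used for each is a purely hypothetical placeholder, and the result is not expected to match the reported coefficients.

```python
import math

def partial_first(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b, holding c constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def partial_second(r, a, b, c, d):
    """Second-order partial correlation of a and b, holding c and d constant.

    r maps frozenset({x, y}) to the zero-order correlation of x and y.
    """
    def r0(x, y):
        return r[frozenset((x, y))]
    r_ab_c = partial_first(r0(a, b), r0(a, c), r0(b, c))
    r_ad_c = partial_first(r0(a, d), r0(a, c), r0(d, c))
    r_bd_c = partial_first(r0(b, d), r0(b, c), r0(d, c))
    return partial_first(r_ab_c, r_ad_c, r_bd_c)

corr = {
    frozenset(("form", "criterion")): 0.33,        # reported in the text
    frozenset(("direction", "criterion")): 0.40,   # reported in the text
    frozenset(("type", "criterion")): 0.49,        # reported in the text
    frozenset(("form", "direction")): 0.20,        # hypothetical
    frozenset(("form", "type")): 0.20,             # hypothetical
    frozenset(("direction", "type")): 0.20,        # hypothetical
}
print(partial_second(corr, "form", "criterion", "direction", "type"))
```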

This is really very little difference in the weights after all. 
It is hardly enough to make weighting worth while, for the
multiple coefficient of correlation is only .60. This means that
the predictions which we could obtain by using the weights
suggested above would probably never correlate any higher 
with the criterion than .60. The actual coefficient without the 
trouble of weighting the scores for the B-group was .58, or 
approximately the same. Weighting the scores, therefore, 
seems not worth the effort. 

The correlation of the test scores with the grades made 
in the course in design gives coefficients of .48 ± .07 and
.58 ± .06. These coefficients are somewhat higher than those
usually obtained between general psychological examinations 
and school marks in college. It is of interest to know whether 
the line-drawing test measures general ability and hence du- 
plicates the general psychological tests. These students of de- 
sign had not been given any such intelligence test and we have 
no other data available to show how well such general tests 
predict success in the fine arts, particularly in design. 

We attempted to discover how much the line-drawing test 
overlaps with tests of general ability by obtaining results from 
two other groups of subjects, not students of fine arts. One 
was a group of 69 Liberal Arts students who had had the 
Army Alpha test and another was 48 students in Wells Col-
lege⁸ who had had the American Council test. Both groups
were given the line-drawing test. The coefficient of correla- 
tion for the former group was only .18 and for the latter, .08. 
Both coefficients are so small as to be negligible and force the 
conclusion that the line-drawing test measures little or nothing 
in common with the usual tests of general intelligence. Ap- 
parently, it does measure some special capacity, probably more 
emotional than intellectual, a capacity that is identified in 
common observation with creative ability and that is impor- 
tant for success in courses in design. 


8 We are indebted to Professor C. O. Weber for these data.


SUMMARY


The writers maintain that tests of creative ability in art
should secure actual samples of the individual’s work as ex-
pressions of that ability. A line-drawing test was constructed
which called upon the subjects to draw simple lines to ex- 
press various feelings and attitudes. A careful though some- 
what elaborate method of scoring was devised. When ap- 
plied to two groups of students in design, the test was found 
to be fairly reliable and very highly valid when the criterion 
of creative ability was based upon the teacher’s ratings. It is 
concluded that the test, which requires only ten minutes to 
give, measures the creative ability of the students as well as 
the teacher can estimate it after a whole semester’s acquain- 
tance. It does not measure to any appreciable degree the 
ability that is measured by tests of general intelligence. 





CONSUMER PREFERENCES FOR SMALL GLASS 
CONTAINERS 


HOWARD T. HOVDE 
Wharton School of Finance and Commerce 


University of Pennsylvania 


Experimental studies in the field of psychology in mer- 
chandising have been somewhat limited, according to an 
article by Viteles.¹ In the case of sales and advertising
managers, an increased interest has been shown by the ac- 
quisition of a supply of psychological jargon which little 
affects merchandising activities and still less reveals factors 
affecting the buying or selling situation. It is not the pur- 
pose of this paper to either deny or defend this statement; 
rather, the writer wishes to present one of many merchan- 
dising investigations of a psychological nature which is apt to 
pass unnoticed because of the selfish interests of those who 
sponsor such investigations. 

This paper reports an investigation which determined the 
consumers’ preference for glass containers for caviar and 
bismarck herring. Franken and Larrabee² have reviewed the
titles of some of the work in this field and in general the pro-
cedure which was followed for this particular problem is to
be found in their book ‘‘Packages That Sell.’’

The purpose of the investigation was to find the best all- 
round glass container for a packaged product when the net 
content of the container remained constant. Two groups of 
different style containers were tested: Group I includes six 
4-ounce containers for caviar; Group II includes five 10-ounce
containers for bismarck herring. (See illustrations.) 


1 Viteles, M. S., Psychology in Industry, Psych. Bull., Vol. 23, No. 11,
pp. 631-680. 

2 Franken, R. B., and Larrabee, C. B., Packages That Sell, Harper & 
Bros., N. Y., 1928, 302 pp. 


[Illustration: Group I. Caviar in 4-ounce containers]


PROCEDURE AS CARRIED OUT IN RETAIL STORES 


The investigation was conducted in a scattered group of 
grocery and delicatessen stores in Philadelphia. The people 
with whom the test was conducted belonged to the general 
class of buyers who were interested in the product. When a 
customer entered the store, the clerk or proprietor, after 
making a sale, would introduce the investigator to his cus-
tomer as a representative of S. Skloroff & Sons.³

While leading the customer over to the table containing the 
various containers, the customer was informed of the purpose 
of the investigation. ‘‘We are trying to pack our products 
in the kind of container that will please our customers. Since 
customers’ preference is desired, the most reliable procedure 
is to come to you directly. I have a number of proposed 
caviar and bismarck herring containers with me that I should
like you to look at and tell me which one you like best. Be-
fore making your selections, I want you to keep in mind this
fact: caviar is expensive, and it is vital that the construction
of the container will allow for complete extraction of its con-
tents.⁴ One other thing I would like to inform you about is,
all the containers, that is all of the caviar group and all of 
the bismarck herring group, cost approximately the same, so 
that the use of one rather than another will not affect the sell- 
ing price.’’ The customer was then shown the containers. 
After he had selected the one he liked best, he was asked to 
select the next best, and so on until he had picked out each 
container in the order of preference. 

‘‘Now,’’ the investigator continued, ‘‘you have probably
noticed that certain containers attract your attention more
than others. Will you please be good enough to tell me to
which container your attention is most forcibly drawn?’’
As in the previous selection, values were placed upon each
package according to order of preference. A third selection
was made, asking the customer to do the same thing over
again, but making a choice this time relative to the identifica-
tion of the container. ‘‘Which package do you think you can
most easily remember if you forgot the trade-name?’’ Only
for the first question was an accurate tabulation of the reasons
given for the preference kept.

3 The writer is indebted to Arthur Skloroff, a student at the Wharton
School of Finance and Commerce, University of Pennsylvania, who col-
lected the data under the supervision of the writer.

4 The reason for this statement is that it would be of no value to have
a particular container which in itself is attractive from all angles but
which would not prove practical from a utility standpoint. The cus-
tomer must be so thoroughly satisfied with his purchase that he will
repeat his order.


METHOD OF TABULATION AND RELIABILITY OF RESULTS


Merchandising investigations have adopted the method of 
rank order as a means of convenient tabulation, and as a test 
of reliability of the data, it has been the practice of various 
investigators to correlate the halves of the measurement. 
This method was followed in the present investigation. Each 
container was given a letter A to F. Every person ranked 
each container in order of superiority from 1 to 6. Scores 
for men and women were kept separately. 

A group of seventy women and thirty men were first
studied for each of the three categories, as heretofore ex- 
plained, and the results of this study were determined at 
once. The investigator then proceeded to obtain groups of 
twenty women and ten men. The results of these groups 
were combined and notice was taken whether or not the ad- 
dition of new groups changed the nature of the results. By 
this means it is possible to tell when a sufficiently large num- 
ber of people had been tested in order to make the results 
reliable for the population at large. When a point was 
reached where the addition of new groups did not change 
the nature of the results, the experimental number was con- 
sidered sufficiently large to make the results reliable. The 
exact degree of reliability of the above tests can be measured 
statistically by the Spearman correlation formula. 
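
The paper names the Spearman correlation formula without writing it out. A minimal Python sketch of the usual rank-difference form is given here, on the assumption that this is the formula meant. The two rankings are the final ranks for identification value printed in Table 3 for the first hundred and for all two hundred interviews. Since the second group includes the first, they are not independent halves, and the figure serves only to illustrate the arithmetic.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-difference correlation for two untied rankings."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Final ranks of containers A-F for identification value (Table 3)
first_hundred = [5, 1, 2, 4, 3, 6]
all_two_hundred = [4, 1, 2, 5, 3, 6]
print(round(spearman_rho(first_hundred, all_two_hundred), 2))   # 0.94
```

Only containers A and D change places between the two tabulations, so the sum of squared rank differences is 2 and the coefficient is about .94.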

Following out this procedure, two hundred people were 
interviewed before it was concluded that the data were suffi-
ciently reliable for the purpose of the study. The majority
of the persons interviewed gave information willingly. For
those few persons whose time or patience was exhausted before
they completely ranked all containers, it was decided to discard
their results and seek more cooperative individuals as it was 
an easier procedure than correcting for statistical omissions. 
Numerous people were frank in stating that the last two and 
three rank order judgments might be made without any 
definite reason behind them. However, this selection would 
not have any effect upon the conclusions of the investigation 
as the purpose of the investigation centered in the first three 
ranking containers. 


RESULTS 


The results of the experiment are compiled in two groups: 
Group I refers to the smaller containers for caviar; Group II 
refers to the larger containers for bismarck herring. 


Group I 


The upper division of the tables (a) is a tabulation from 
one hundred persons consisting of the first seventy women 
and the first thirty men who were used as subjects in the 
experiment. From this period on, subsequent tabulations 
with each addition of twenty women and ten men were made 
until two hundred interviews were accomplished, when it was 
found that only the slightest variation, with no effect on the 
final result, existed with new additions. The lower division of 
tables (b) combines both the first and last hundred interviews. 

In examining the tables, the reader will note that each glass 
container is given a letter (A, B, C, D, E, F). These letters 
are without significance, since they were selected at random 
and are used only for convenience of record. 

The nature of Tables 1, 2, 3 and 4 has been explained in
the foregoing paragraphs. Table 5 gives the relative value
of the results in terms of percentages; that is, container C,
ranking first with a final average of 1.56, is given an arbitrary
value of 1.00, and then container B, ranking second with a
final average of 2.18, would have a value of .71. The per-
centage .71 is in direct proportion to the value of the best
container, as are also those of all the other containers.
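
The paper does not spell out how the relative values were computed. One reading consistent with the figures quoted is that each value is the best (lowest) average divided by the container's own average. The Python sketch below applies that assumed rule to the combined averages of Table 4. The function name is illustrative.

```python
def relative_values(averages):
    """Express each average rank relative to the best (lowest) average."""
    best = min(averages.values())
    return {name: best / avg for name, avg in averages.items()}

# Combined final averages from Table 4
table4 = {"A": 5.49, "B": 2.18, "C": 1.56, "D": 4.33, "E": 2.45, "F": 4.99}

ranked = sorted(relative_values(table4).items(), key=lambda kv: -kv[1])
for name, value in ranked:
    print(name, f"{value:.3f}")
```

For container B the ratio is 1.56 / 2.18, or roughly .716, which agrees with the printed .71 if the figure was cut off rather than rounded.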

The data from which these tables are compiled show that 
out of two hundred people, one hundred and fifty-two chose 
‘‘C’’ as the best container. One hundred and sixty selected
‘‘C’’ as having the greatest attention value; and one hundred
and two selected ‘‘C’’ for first ranking in identification value. 

Of the one hundred and fifty-two persons who selected con- 
tainer C for their preference, one hundred and forty gave 
reasons. They are as follows: 
TABLE 1—SELECTION FOR BEST CONTAINER*
Table of Averages (Lowest Average the Best)

                                    FINAL AVERAGE    FINAL
       CONTAINER   WOMEN    MEN     WOMEN & MEN      RANK

(a)    A           5.90     5.85    5.88             6
       B           3.00     2.75    2.92             3
       C           1.62     1.50    1.58             1
       D           3.63     4.25    3.82             4
       E           2.50     2.15    2.40             2
       F           4.35     4.50    4.40             5

(b)    A           5.88     5.85    5.87             6
       B           3.05     2.90    2.91             3
       C           1.43     1.55    1.46             1
       D           3.70     3.90    3.76             4
       E           2.30     2.20    2.27             2
       F           4.64     4.60    4.73             5

* Upper division (a) contains first hundred samples. Lower division
(b) contains two hundred samples, including upper group samples.


TABLE 2—SELECTION FOR ATTENTION VALUE*
Table of Averages (Lowest Average the Best)

                                    FINAL AVERAGE    FINAL
       CONTAINER   WOMEN    MEN     WOMEN & MEN      RANK

(a)    A           5.71     5.89    5.73             6
       B           2.16     2.04    2.12             2
       C           1.22     1.41    1.28             1
       D           4.43     4.13    4.34             4
       E           2.78     2.62    2.73             3
       F           4.70     4.91    4.77             5

(b)    A           5.80     5.90    5.83             6
       B           1.98     2.10    2.02             2
       C           1.30     1.22    1.27             1
       D           4.53     4.10    4.40             4
       E           2.62     2.75    2.66             3
       F           4.77     4.93    4.82             5

* Upper division (a) contains first hundred samples. Lower division
(b) contains two hundred samples, including upper group samples.


TABLE 3—SELECTION FOR IDENTIFICATION VALUE*
Table of Averages (Lowest Average the Best)

                                    FINAL AVERAGE    FINAL
       CONTAINER    WOMEN    MEN    WOMEN & MEN      RANK

(a)        A         4.99    4.80       4.93           5
           B         1.50    1.70       1.56           1
           C         1.81    1.90       1.84           2
           D         4.71    4.53       4.66           4
           E         2.63    2.44       2.57           3
           F         5.36    5.63       5.44           6

(b)        A         4.80    4.71       4.77           4
           B         1.55    1.77       1.62           1
           C         1.95    1.97       1.96           2
           D         4.87    4.68       4.81           5
           E         2.48    2.25       2.41           3
           F         5.35    5.62       5.43           6

* Upper division (a) contains first hundred samples. Lower division
(b) contains two hundred samples, including upper group samples.

TABLE 4—COMBINED RESULTS OF PRECEDING TABLES
Table of Averages (Lowest Average the Best)

       CONTAINER    WOMEN & MEN    FINAL RANK

           A            5.49           6
           B            2.18           2
           C            1.56           1
           D            4.33           4
           E            2.45           3
           F            4.99           5

TABLE 5—RELATIVE VALUES OF RESULTS

       CONTAINER    AVERAGE    RELATIVE VALUE

           C          1.56          1.00
           B          2.18           .71
           E          2.45           .64
           D          4.33           .36
           F          4.99           .31
           A          5.49           .28

Group II 


Tables 6, 7, 8, 9 and 10 in Group II are compiled in an 
identical manner to Tables 1, 2, 3, 4 and 5 in Group I and 
need no further comment. 

The data from which these tables are compiled show that 
out of two hundred people, one hundred and twenty-eight 
chose ‘‘D’’ as the best container, while sixty-eight chose ‘‘B.’’ 
One hundred and seventy-two selected ‘‘D’’ as having the 
greatest attention value; and one hundred and sixty-one 
selected ‘‘D’’ for first ranking in identification value. 

Of the one hundred and twenty-eight persons who selected 
container ‘‘D’’ for their preference, one hundred and twenty- 
six were asked to give reasons. They are as follows:

TABLE 6—SELECTION OF BEST CONTAINER*
Table of Averages (Lowest Average the Best)

                                    FINAL AVERAGE    FINAL
       CONTAINER    WOMEN    MEN    WOMEN & MEN      RANK

(a)        A         4.62    4.51       4.59           5
           B         1.64    1.63       1.64           2
           C         3.21    3.29       3.23           3
           D         1.32    1.30       1.31           1
           E         4.21    4.27       4.23           4

(b)        A         4.70    4.60       4.67           5
           B         1.73    1.81       1.75           2
           C         2.90    2.93       2.91           3
           D         1.48    1.33       1.44           1
           E         4.19    4.33       4.23           4

* Upper division (a) contains first hundred samples. Lower division
(b) contains two hundred samples, including upper group samples.


TABLE 7—ATTENTION VALUE*
Table of Averages (Lowest Average the Best)

                                    FINAL AVERAGE    FINAL
       CONTAINER    WOMEN    MEN    WOMEN & MEN      RANK

(a)        A         3.74    4.40       3.94           4
           B         2.15    1.85       2.06           2
           C         4.93    4.60       4.83           5
           D         1.16    1.31       1.20           1
           E         3.02    2.84       2.97           3

(b)        A         3.87    4.32       4.01           4
           B         2.08    1.91       2.03           2
           C         4.72    4.63       4.69           5
           D         1.22    1.45       1.29           1
           E         3.11    2.69       2.98           3

* Upper division (a) contains first hundred samples. Lower division
(b) contains two hundred samples, including upper group samples.

TABLE 8—IDENTIFICATION VALUE*
Table of Averages (Lowest Average the Best)

                                    FINAL AVERAGE    FINAL
       CONTAINER    WOMEN    MEN    WOMEN & MEN      RANK

(a)        A         4.47    4.44       4.46           5
           B         2.01    1.90       1.98           2
           C         4.12    4.41       4.21           4
           D         1.03    1.12       1.05           1
           E         3.37    3.13       3.30           3

(b)        A         4.30    4.50       4.36           5
           B         2.12    1.94       2.07           2
           C         4.21    4.32       4.24           4
           D         1.05    1.15       1.08           1
           E         3.32    3.09       3.25           3

* Upper division (a) contains first hundred samples. Lower division
(b) contains two hundred samples, including upper group samples.

TABLE 9—COMBINED RESULTS OF PRECEDING TABLES
Table of Averages (Lowest Average the Best)

       CONTAINER    WOMEN & MEN    FINAL RANK

           A            4.35           5
           B            1.95           2
           C            3.94           4
           D            1.27           1
           E            3.49           3

TABLE 10—RELATIVE VALUE OF RESULTS

       CONTAINER    AVERAGE    RELATIVE VALUE

           D          1.27          1.00
           B          1.95           .65
           E          3.49           .36
           C          3.94           .32
           A          4.35           .29


DISCUSSION
Group I 


Since Table 4 is a compilation of the preceding three tables
(1, 2 and 3), covering Best Container, Attention Value
and Identification Value, conclusions may be drawn from it for the
caviar containers. Here we find that container C ranks first, with
B, E, D, F and A following in descending order. Container
C ranks first in every test but Identification
Value, where it took second place. It should be noted at this
point that E is the container used by almost all the packers
in this particular line up to the present time. It is therefore
interesting to note the consumers’ ranking for container E. This
package took third place in the final ranking. In all but one
test it ranked third. The exception was the test to determine
the best container, in which E ranked second, thus displaying
either the influence of habit or that the confidence placed in it
by the packers has not been entirely unfounded. Container B,
which is a barrel-shaped jar, originally inserted in the experiment
because of its ‘‘atmosphere,’’ was selected first for
Identification Value. Its margin over C is very slight. The
following facts will more completely show the superiority of
container C over the rest of the field. Out of the two hundred
people interviewed, one hundred and fifty-two selected it as
the best container; one hundred and sixty chose it as having
the greatest attention value, while one hundred and two gave
it first ranking in identification value. To sum up,
out of six hundred possible first places, C received
four hundred and fourteen, or over two-thirds.

Table 5, Relative Values, gives the direct proportion of the
various containers to the best container. Thus the best package,
C, is relatively more than one and a half times as good as E
(1.00 against .64), and about three and a half times as good as
the poorest package, A (1.00 against .28). This would indicate
that, providing the quality of merchandise is equally good,
C will have more than one and a half times the sales value of
the container presently used, E. With competition keen in this
field, the concern using container C will be able to sell
relatively more merchandise than a competitor, providing other
conditions remain constant.

Sex differences are negligible in this experiment. The only 
variation is found in Table 3 where the rank order of fourth 
and fifth places is inverted for preferences of women and 
men. 


Group II


The results of the experiment for Bismarck herring containers
are analyzed in the same manner as those for Group I. Table 9 is
a résumé of Tables 6, 7 and 8, covering Best Container, Attention
Value and Identification Value respectively. Container
D received first place in the final ranking, with container B
second, thus following the separate rankings throughout the
three tests. The B container is the type of container used
by a number of concerns at the present time.

Analyzing Table 6, we find that D received a combined
average of women and men of 1.44, while B received 1.75 for
the next best container preference. The results of Tables 7
and 8 show that the margin of D’s superiority over B in both
Attention and Identification Value is somewhat greater than
in the previous table. The final average of women and men
(Table 9) is 1.27 for D as against 1.95 for B. From the viewpoint of
Identification Value, as found in Table 8, container D conclusively
displays its superiority.

For third, fourth and fifth places, containers E, C and A
appear to have different advantages respectively according to
the tests of ‘‘best,’’ most attractive and most easily identified.
Since these differences fall in low rank choices, they need not
concern us and are probably to be explained by the fact that
they fall low in individual preferences.

Table 10, Relative Values, gives the direct proportion of
the various containers to the best container. Thus the best
package, D, is relatively about one and a half times as good as B
(1.00 against .65) and almost three and a half times as good as
the poorest container, A (1.00 against .29). The
possible conclusion to be drawn from this is that, providing
the quality and reputation of the concerns are alike, container
D will have about one and a half times the sales value of its
nearest competitor, B.

Sex differences are negligible in this experiment. The rank
order of preference for men agrees with that of the women.


CONCLUSIONS 

Modern merchandising takes advantage of certain psycho- 
logical values. In this investigation it was shown that, despite 
the identical net volume of a group of small glass containers, 
there exists a psychological advantage in form and shape due 
to utility, attention value and identification purpose. The 
seemingly lengthy procedure of testing is fully justified when 
one considers and compares the increased possibilities in sales 
and display value gained by the reinforcement of the com- 
modity with the proper glass container. 








A THEORY OF TWO FACTORS: AN ALTERNATIVE 
EXPLANATION 


PART II. SPECIFIC FACTORS¹


HENRY F. ADAMS 
University of Michigan 


Spearman’s theory maintains that the measurement can be 
divided into two factors when, and only when, the tetrad 
difference criterion is satisfied. g is constant for each indi- 
vidual, no matter what the task may be, though more g may 
be used in one task than in another. 

Contrasted with the constancy of g is the variability of s. 
s varies from person to person in the same task, and from task 
to task with the same person. The different s’s are not only 
uncorrelated with each other, but are also uncorrelated with g. 
Spearman² says, ‘‘Among the practical consequences of this
complicated and still obscure nature of s is the extreme difficulty
of measuring specific aptitudes. For any total ability
of any person at any stage of growth, a good measurement
can be obtained without any difficulty whatever; so much is
afforded by the corresponding ‘test of achievement.’ And
after ascertaining this, together with the person’s g, one can 
easily deduce his s as a whole. But for both theoretical and 
practical purposes, we require to eliminate from this whole s 
all that is merely due to some more or less accidental and 
changeable habit of procedure.’’ 

In his mathematical analysis of s, Spearman³ treats the s’s
as though they were errors. If $r_{ag}$ is the correlation coefficient
obtained from the per cent of correct responses in the
test, then $r_{as_a}$ is likewise the correlation coefficient obtained
from the per cent of incorrect responses. Since errors of
response can appear only when some standard of evaluation
exists, it is a logical consequence that the second factor, s,
owes its existence to the standard. Without the standard,
the value of s cannot be determined.

1 Part I, General Factor “G,” appeared in the February, 1931, issue of
this JOURNAL.
2 The Abilities of Man, page 371.
3 The Abilities of Man, appendix, pages i and ii.

For assigning mathematical values to his various terms,
Spearman gives the following fundamental equation.⁴

$$m_{ax} = r_{ag} \cdot g_x + r_{as_a} \cdot s_{ax}$$

$m_{ax}$ is the measurement in the test, presumably a percentage
value expressed as a decimal. $r_{ag}$ is the correlation between g
and the test; $g_x$ is the amount of the general factor; $r_{as_a}$ is the
correlation between the specific factor and the score in the
test. $s_{ax}$ is the value of the specific factor. $r_{ag}$ is obtained
directly from the table of correlations by the use of the formula
$r_{ag} = \sqrt{r_{ab} \cdot r_{ac} / r_{bc}}$. Further, $g_x = m_{ax} \times r_{ag}$,
$r_{as_a} = \sqrt{1.00 - r_{ag}^2}$, and $s_{ax} = m_{ax} \times r_{as_a}$.
The various values for the material offered in
Table 1 in part I of the paper are presented in the table below.
The $r_{ag}$ values are given in the table. The $m_{ax}$ values were
obtained from the table giving the correlation equivalent for
various percents of unlike signs. The other quantities were
computed by the formulas given above.
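
The derived quantities can be recomputed from the formulas just quoted. The following Python lines are a minimal sketch under the assumption that those formulas are applied directly to the tabled $m_{ax}$ and $r_{ag}$ values; the function name `spearman_components` is hypothetical and introduced only for illustration.

```python
import math

# Hypothetical helper applying the formulas quoted in the text:
#   g_x    = m_ax * r_ag
#   r_as_a = sqrt(1 - r_ag**2)
#   s_ax   = m_ax * r_as_a
def spearman_components(m_ax, r_ag):
    """Return (g_x, r_as_a, s_ax) for one test."""
    g_x = m_ax * r_ag
    r_as = math.sqrt(1.0 - r_ag ** 2)
    s_ax = m_ax * r_as
    return g_x, r_as, s_ax

# (label, m_ax, r_ag) for the first circle test and the last line test.
for label, m_ax, r_ag in [("C1", 0.91, 0.96), ("L5", 0.72, 0.64)]:
    g_x, r_as, s_ax = spearman_components(m_ax, r_ag)
    print(label, round(g_x, 2), round(r_as, 2), round(s_ax, 2))
    # The two components together exceed the measurement scale's upper limit.
    print("  g_x + s_ax exceeds 1.00:", g_x + s_ax > 1.0)
```

For both rows the rounded values of $g_x$, $r_{as_a}$ and $s_{ax}$ agree with the tabled entries, and the two components sum to more than 1.00.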


TABLE IX

          m_ax    r_ag    g_x    r_as_a    s_ax    g_x + s_ax

   C1      .91     .96    .87      .28      .25       1.12
   C2      .89     .94    .84      .34      .30       1.14
   C3      .86     .91    .78      .41      .35       1.13
   C4      .83     .86    .71      .51      .42       1.13
   C5      .80     .80    .64      .60      .48       1.12
   L1      .88     .93    .82      .37      .33       1.15
   L2      .83     .86    .71      .51      .42       1.13
   L3      .78     .78    .60      .63      .49       1.09
   L4      .75     .70    .53      .71      .53       1.06
   L5      .72     .64    .46      .77      .55       1.01

4 The Abilities of Man, appendix, page xiv.

Here there are ten experiments, all involving either discriminations
of areas of circles or of lengths of lines, and all
performed by exactly the same subjects. In each of the two
sets of five, the differences to be discriminated decreased regularly
from C1 to C5, and from L1 to L5. Furthermore, each
of the values given above is obtained from the whole group,
and a group, in an experiment of this kind, is certainly more
stable than an individual. Yet g, in the Spearman sense, is
not constant throughout, but, with one exception, varies considerably
from test to test. The only way to make the g constant
is to make it vary with the difficulty so that the product
of g × some function of d shall be constant. $g_x$ would then
refer to the result obtained from the application of power
rather than to the power itself. $g_x$ must consequently be a
result of the application of ‘‘ability’’ rather than the amount
of ability, if the ability is to be considered constant throughout.

In the earliest account of his theory, Spearman favored the
use of the square of the correlation as indicating the extent
to which the measurement was saturated with the general factor.
He says, ‘‘The influence of an element is measured by
the square of its correlational value.’’⁵

$r_{ag}^2 = r_{aa}$, the reliability, or the consistency of the individual
in two or more performances with the same material. If any
single score is obtained as a result of both skill and chance,
then, when the test is repeated, those items which were
answered correctly because of skill should be identical in the
two performances. Those which were answered correctly by
chance should show simply chance agreement. In so far as
there is any agreement the resulting coefficient will be higher
than pure skill would warrant. Consequently, $r_{ag}^2$ gives a
value somewhat higher than pure skill or ability would create.

In addition to this, the sum of the two factors, in all of the
above instances, is greater than 1.00. Yet, in a correlational
procedure, the sum of the two factors should vary between
0 and 1.00. The ability can be no more than perfect.

5 American Journal of Psychology, 1904, Vol. 15, see pages 75, 273
and 277.

The alternative explanation to be maintained in this paper
is an expansion of the point of view presented in part I. It
will be recalled that for Spearman, g may have any value
between 0 and 1.00. Our contention was that x (or g) must
always be equal to 1.00, and that the value of a is indeterminate,
when the tetrad difference criterion is satisfied. The
lower limit of performance is necessarily zero, for the ability
shown in any test, and expressed in terms of correlation coefficients,
must range from 0 to 1.00. The two limits of the scale
are set. When speaking of ability, we naturally mean extension
towards the 1.00 or x limit. On the contrary, when
disability as revealed by errors is under consideration, the
standard is reversed and becomes x − x, or zero.

We apply the same procedure as previously and determine
the correlation of the errors by the use of the fundamental
formula. Since we are now dealing with errors, the formula
must be changed to read

$$r_{ab} = r_{a(x-x)} \cdot r_{b(x-x)}.$$

Obviously a and b both become 0, and consequently ab is also
equal to 0. e, as x − x, is 0. Consequently, if the ae, be, ce, ...,
ne values are spread along the diagonal of the table, and the
coordinate ab, ac, bc, ..., np values be determined by multiplication
of the quantities in the fundamental equation, the
whole table can contain nothing but zeros. Since the errors
must be uncorrelated to satisfy the tetrad difference criterion,
this result is exactly what must happen. The same conclusion
is reached by the use of the formula

$$r_{ae} = \sqrt{r_{ab} \cdot r_{ac} / r_{bc}}$$

for ab, ac, and bc are all equal to zero. Consequently, the
correlations of the errors with the zero standard must all be
zero. If this be true, specific abilities cannot be determined
from errors, for the correlations must always be zero.

The conclusion is that any attempt to isolate two factors,
such as g and s, from tables of correlations which satisfy the
tetrad difference criterion, is futile. g (or x) will be equal to
1, and s (or e) will be equal to 0. The value of a is known
immediately from the value of ag. The error value of a is
known immediately from as and is 0. a, consequently, is the
only useful unknown, and its value can be obtained more
quickly and easily from the percentage of items correct, or
from the correlation of the performance with the standard,
than by the dissection of tables of correlations. The analysis
of a into its component parts is a matter of very great importance,
both practically and theoretically, but it does not seem
that the Spearman technique can accomplish this desirable
end.

Two cues concerning the nature of a have been derived from 
the discrimination experiments. First, it has already been
shown that aa is a measure of the true reliability of the measurements,
that is, the consistency of performance in two tests
with the same material when systematic and constant errors
are eliminated either by experimental control or by statistical
treatment.

Secondly, it has been found that a, or aa, is a function of
the amount of difference in the discrimination material. If
the difference which can be discriminated correctly in 75
per cent of the cases be called a perceptible difference and
represented by the symbol pd, and if the actual physical increment
of difference in any series of experimental material be called d,
then the ratio of the one to the other can be expressed either
as pd/d or as d/pd. If d = pd be taken as the mid-point of a
distribution, and pd/d values be computed for supra-liminal values
and d/pd values for sub-liminal values, a scale extending
equally on both sides of the mid-point is obtained. If this
scale be made the x axis, and the a, or aa, values assigned to
the y axis, then a and aa values can be plotted against multiples
or submultiples of the pd values. This procedure is
adopted to give an arithmetic series on both sides of the pd = d
point. If pd/d were used throughout, or d/pd, the progression
would be arithmetic on one side and geometric on the
other of the mid-point.

When the procedure described above is adopted, and the
aa values plotted against the pd/d or d/pd, a straight line of
relationship is obtained, extending from a pd/d value equal
to 1/5 to a d/pd value equal approximately to 1/5. aa values lying
equal distances to right and left of the axis where pd = d,
when added together, give a value of 1.00. As a next step,
the x extent can be divided into 100 equal parts, thus forming
a percentage scale. If 1.00 (100 per cent) on the x axis be
made coordinate with an aa value of 1.00 on the y axis, then
any aa value divided by its x percentage equivalent will equal
1.00. This means that in the graph constructed according to
the specifications outlined, the reliability varies directly with
the multiples and submultiples of the perceptible difference.
aa or reliability consequently becomes an index of the amount
of the difference to be appraised. From this point of view, aa
seems to be an index of the coarseness or fineness of the calibration
of the scale. Low aa means fine calibration, high aa
coarse calibration.

From a somewhat different point of view, a, or aa, may be 
taken to express the degree of attainability of the standard 
on the part of the individual or group taking the test. Some 
standards are less possible of attainment than others; that is 
to say, some standards are ‘‘higher’’ than others. The higher 
the standard, the lower relatively will be the performance. 
If the same group performs the same discrimination experi- 
ment a second time, or is retested on a spelling test, the aver- 
age score or performance on the two occasions will be approxi- 
mately constant. If, on the other hand, the differences to be 
discriminated be made smaller in the second test, or a more 
difficult list of words be provided, we may equally well expect 
the actual ‘‘power’’ to stay constant even though the scores 
are reduced in amount.

The situation may be represented diagrammatically. Let
$x_1$ be the maximum standard which will give an a value of
1.00. In the discrimination experiments this standard will be
approximately 5 pd. Let performance $p_2$ have an a value of
.75, $p_3$ an a value of .50, and $p_4$ an a value of .25. By
hypothesis these different a values indicate a constant discriminal
‘‘power’’ applied to progressively more difficult
problems. If the ‘‘power’’ is constant, the situation is
analogous to that found in measuring work in foot pounds.
Here w = dl. In the test situation, the performance may be
thought of as corresponding to the distance in the equation,
and the altitude of the standard to the load. Since w, by
hypothesis, is constant, it may be made equal to 1.00. The
altitude of the standard, relative to $x_1$, may be determined
by solving the equation d = w/l for l. Then $x_1$ = 1.00, $x_2$ = 1.33,
$x_3$ = 2.00, and $x_4$ = 4.00. Consequently, $x_3$ has one and a half
times the altitude of $x_2$, and one half the altitude of $x_4$. Some
such treatment as this is necessary if mental ‘‘power’’ is to
be thought of as a constant. Comparisons in relative ‘‘power’’
can then be made in terms of relative altitude of standard in
the same task or test. If A’s standard is twice as high as
B’s on exactly the same test, B may be considered to have
twice the ‘‘power’’ of A.

In the situation thus loosely described, two quite different
concepts can be discerned. One, which may be referred to
as p, is indicated by the excellence of the particular performance,
such as a score of 186 on the Army Alpha. P is consequently
observable, objective, and possibly measurable; it
is the known. But p depends upon the rendering dynamic of
something which had been potential. This potential something,
which may be referred to as q, is an unknown, subjective
in its nature, not immediately observable and not directly
measurable. Yet when q is quickened and is applied to a
task, a certain result, p, is obtained. Q is certainly different
from the Spearman g, for the latter is supposed to be entirely
cognitive, whereas no limits are set to the nature of q merely
by calling attention to its existence.

Ability, in popular writings as well as in others which are
not so popular, has been identified indifferently with both
p and q, or sometimes with p and sometimes with q. An
undistributed middle is the inevitable result. If we are to
retain the word ability, it must be identified with one or the
other, or neither. If ability be applied to p, or the excellence
of performance, we shall probably come close to the popular
usage. Q must then be made to express in a very figurative
way mental power. Mental is used advisedly, for it covers
more than cognition. Power seems to convey the central idea
somewhat more precisely and exactly than energy.

The remaining question is what is indicated by the expression
$r_{ab}$, which is equal to $r_{ax} \cdot r_{bx}$. Now if a is equal to 1.00,
and b is equal to 1.00, ab obviously becomes 1.00, for each
ability correlates perfectly with the standard, and by hypothesis
the standards are identical in form. Consequently in
this limiting case $r_{ab}$ represents the correlation between the
standards. Now if, because of random, uncorrelated errors, ax
is reduced to .90, and bx to .80, ab then becomes equal to .72.
But when corrected for attenuation by random errors the
true $r_{ab}$ again proves to be 1.00, the correlation between the
standards. The uncorrected $r_{ab}$ still represents an approximate
correlation between the standards, lower than it should
be because of the random errors which entered into the measurements.
Can the same value, .72 in our illustration, indicate
simultaneously the correlation between two standards and
the correlation between the abilities of the subjects to approximate
to those standards? Or must we correct for attenuation
to determine the correlation between the standards and let
the raw correlation measure the relationship between the
abilities? The .72 is still an erroneous correlation between
standards, lower than it should be because of the inability of
the subjects. If an index of ability is to be found anywhere,
it must be in connection with the ax and bx values of .90 and
.80. If ability is defined arbitrarily as (ax)² or (bx)², then
the geometric mean of the two will equal ab. But why use
the square rather than the first power of the coefficient to
express ability? If it is done to allow for the effect of chance
successes on the test, it will sometimes over and sometimes
under compensate. The correlation between two tests is equal
to the reliability of a hypothetical third test intermediate
between the two. Then ability would be shown by reliability.
If the first power of ax and bx be used instead of their squares,
then √ab, or .85, may be taken as the correlation of a hypothetical
intermediate test, c, with the standard. $r_{ab}$, $r_{ac}$, etc.,
values appear to express the relationship between standards
rather than the relationships between abilities. Possibly the
correlation between abilities is equal to the correlation between
standards.
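
The arithmetic of this illustration can be checked in a few lines of Python. The sketch below assumes nothing beyond the values .90 and .80 given in the text; the variable names are hypothetical.

```python
import math

# Values of ax and bx from the illustration in the text.
ax, bx = 0.90, 0.80

# Fundamental formula: ab = ax * bx.
ab = ax * bx

# Geometric mean of the squared coefficients (ax)^2 and (bx)^2.
geometric_mean_of_squares = math.sqrt(ax ** 2 * bx ** 2)

# Correlation of the hypothetical intermediate test c with the standard.
intermediate = math.sqrt(ab)

print(round(ab, 2), round(geometric_mean_of_squares, 2), round(intermediate, 2))
```

The first two quantities coincide, as the paragraph states, and the square root of ab rounds to .85.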

Exactly the same point is brought out in personality rat- 
ings. If the correlation between two traits is .50, and the 
reliabilities are .48 and .52 respectively, then the corrected 
coefficient is 1.00, showing that the two standards of evalua- 
tion are identical in form. If the raw correlation proves to 
be .25, with the reliabilities the same as before, the corrected 
coefficient is .50. This figure shows the correlation between
the standards used in evaluating the persons; the reliabilities
are an index of the amount of the differences in those judged.
As such they may be used to determine the ability of the 
raters to rate. But do they tell anything about the ability 
of those rated to approximate to the standards in terms of 
which they are evaluated? Apparently not. If this be so, 
then traits are not correlated with traits, nor abilities with 
abilities, nor abilities with traits. Standards only are corre- 
lated by the usual techniques, and then only imperfectly. 
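
The corrected coefficients quoted for the personality ratings can be reproduced with the usual correction for attenuation, in which the raw correlation is divided by the geometric mean of the two reliabilities. The Python sketch below assumes that this standard form is the correction intended; the function name `correct_for_attenuation` is hypothetical.

```python
import math

def correct_for_attenuation(r_ab, r_aa, r_bb):
    """Raw correlation divided by the geometric mean of the two reliabilities."""
    return r_ab / math.sqrt(r_aa * r_bb)

# Personality-rating examples from the text: reliabilities .48 and .52.
first = correct_for_attenuation(0.50, 0.48, 0.52)
second = correct_for_attenuation(0.25, 0.48, 0.52)

# Earlier illustration (ax = .90, bx = .80), ASSUMING each reliability equals
# the square of the correlation with the standard, as the paper maintains.
third = correct_for_attenuation(0.72, 0.90 ** 2, 0.80 ** 2)

print(round(first, 2), round(second, 2), round(third, 2))
```

Under these assumptions the first and third cases come out at 1.00 and the second at .50 when rounded to two places.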


PART III. 


Group Factors 


That the tetrad differences criterion does not hold for tables 
of correlations when the measurements from which they are 
derived are obtained under certain conditions has been 
brought out by Spearman. Tables of intercorrelations between
physical measurements do not tend to assume the tetrad
form. In the mental test situation, also, it has been found
that when tasks are too similar, the tetrad form is violated.
The main cause, however, is assumed to be correlated errors.
The correlated errors, or overlapping specific abilities superimposed
upon the $r_{ab}$ values, give a total result which is too
high to fit into the tetrad table. But it is only where corre- 
lated errors appear that such inflation follows and the amount 
of inflation varies with the amount of the correlation between 
the errors. Consequently, a table showing such intercorrela- 
tions will be very irregular instead of nicely graduated as is 
the tetrad table. 
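
The effect of a correlated error on the tetrad form can be illustrated numerically. The criterion is not written out in this part of the paper, so the sketch below assumes Spearman's usual form, in which $r_{ab} r_{cd} - r_{ac} r_{bd}$ vanishes for a table built from a single factor. The loadings, the size of the inflation, and the function names are hypothetical.

```python
from itertools import combinations

def correlation_table(loadings):
    """One-factor table: r_ij = loading_i * loading_j for each pair i, j."""
    return {frozenset(pair): loadings[pair[0]] * loadings[pair[1]]
            for pair in combinations(loadings, 2)}

def tetrad_difference(table, a, b, c, d):
    """Tetrad difference r_ab * r_cd - r_ac * r_bd."""
    def r(i, j):
        return table[frozenset((i, j))]
    return r(a, b) * r(c, d) - r(a, c) * r(b, d)

# Hypothetical correlations of four tests with a single standard.
loadings = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
table = correlation_table(loadings)
clean = tetrad_difference(table, "a", "b", "c", "d")

# Inflate one coefficient, as a correlated error shared by a and b would.
table[frozenset(("a", "b"))] += 0.10
inflated = tetrad_difference(table, "a", "b", "c", "d")

print(clean, inflated)
```

With the uninflated table the tetrad difference is zero apart from floating-point error; raising the single coefficient for a and b makes it depart from zero, which is the irregularity the paragraph describes.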

Spearman and his followers have succeeded in searching out 
certain types of response to test situations which reveal group 
factors. Analysis of these shows three main trends which 
spoil the tetrad form of table. They are as follows: 

A. Systematic errors, or the tendency for the person to repeat
   an error.

B. Constant errors, or the tendency of a group to share the same
   error.

C. Different standards used in the reactions, as in the correlations
   of height and weight, or as in personality ratings where the
   halo affects the judgment.


The alternative explanation of group factors offered in this 
paper is in favor of C above. It is to the effect that the table 
of correlations would always have the tetrad form if a single 
standard were used throughout. When one standard is used 
in one test and a different standard in the other, as in height 
and weight, or as in intelligence and perseveration, the table 
of correlations will not be of the tetrad form. Also, when 
two standards are used alternately in the same test, one de- 
pending upon judgment and the other upon memory, or upon 
emotion, the resulting table will again not be of the tetrad 
form. 

It is further maintained that both systematic and constant 
errors are convenient classificatory terms of a mathematical
sort, but that they do not do justice to the mental content of
the human being taking the test. For him the responses are 
either random or purposeful. If random the ‘‘errors’’ will 
be uncorrelated. But in a test the response is ordinarily made 
in terms either of judgment or memory. Whichever of these 
two functions appears to the subject to be easier in the par- 
ticular case will be favored. In an experiment on the dis- 
crimination of weights which was repeated 20 times on each 
subject the reaction was made in terms of judgment when the 
discrimination was easy, whereas when the discriminations 
became increasingly difficult, memory was used more and more 
and judgment less and less. When judgment was used the 
criterion was the correct order of weights. But when memory 
was substituted the criterion was agreement with the previous 
performances, regardless of whether they were right or wrong 
from the standpoint of judgment. If the performance is 
evaluated in terms of judgment, many of the reactions made 
in terms of memory will be consistently wrong, and will be 
called systematic errors. Consequently, systematic errors 
appear in the main when one standard is used in evaluating 
a multistandard performance. 

With constant errors the picture is similar except that a 
standard which is erroneous in part or in toto, is substituted 
by the subjects for the true standard, while the evaluation is 
made in terms of the true standard. 

The conclusion is that the so-called systematic and constant
errors are traceable directly to either the mixed standard, the
multistandard, or the false standard situation. The discovery
has been recently made that when a single standard has been
used throughout so that the errors are random, then self-consistency
equals the square of the correlation of the performances
with the standard. Expressed in the usual symbols,
$r_{sc} = r_{gc} = r_{ax}^2$. When so-called systematic errors are
present, $r_{sc} > r_{gc}$, but $r_{gc} = r_{ax}^2$. When so-called constant
errors are present, $r_{sc} > r_{gc}$ and $r_{gc} > r_{ax}^2$. $r_{ax}$ remains
approximately constant throughout and consequently is the
most useful term available in computing correlation coefficients.

If the equation ab = ax × bx holds when only random errors
are present, then any change in the experimental conditions 
which will permit either the use of the multistandard or the 
false standard will create correlated ‘‘errors,’’ and the funda- 
mental equation must therefore be modified if it is to account 
for the obtained coefficient. 

As illustrations of what happens to produce systematic 
‘‘errors,’’ the following cases of retesting may be cited. In 
the original test it may be legitimately assumed that skill or 
judgment is responsible for the excellence of the performance. 
On the retest with precisely the same material it is equally 
legitimate to assume, if there is any truth in the laws of learn- 
ing and habit formation, that, superimposed upon the judg- 
ments, will be a tendency for the second test performance to 
resemble the first more closely than a combination of skill and 
chance would admit. In some extreme cases this too close 
resemblance would be due to memory of the previous reaction ; 
at the opposite extreme might be those obscure cases of asso- 
ciative connection familiar to all students of animal behavior. 
The multistandard can be used, then, only in the second or
subsequent tests.⁶

When this happens the obtained coefficient is inflated be- 
yond what it should be—is really the sum of the true correla- 
tion plus the spurious correlation derived from the correlated 
‘‘errors.’’ The addition of the memory standard can affect 
the $r_{bx}$ value to no appreciable extent; if memory were
perfect, it would obviously have no effect, for in that case
$r_{bx}$ would be equal to $r_{ax}$. So we may say that the true
correlation is given, when the standards are identical in form,
by the formula $r_{ab} = r_{ax} \times r_{bx}$. This $r_{ab}$ value is
identical with that obtained when the performances of the
different subjects are intercorrelated one with another, and the
average taken. This value has already been referred to as group
consistency, or $r_{gc}$. The same result could be obtained by
matching the subjects in ability on the first test, and
correlating the performance of one member of each pair against
the performance of the other member of each pair. Then we should
get the correlation undisturbed by memory, or whatever it is that
produces the systematic errors. When the “reliability,” or
self-correlation, is computed, the systematic “errors” get in
their deadly work.

6 It may of course appear in the first if items involving the same
principle are given a number of times, or if exactly the same
elements are repeated in different contexts.

Of course, memory cannot play its rôle unless there is some
cue, as a symbol or a meaning, to which it can attach. But 
in mental tests there are always meanings. It is consequently 
safe to hazard the generalization that the so-called ‘‘reliabili- 
ties’’ of mental tests tend to be universally higher than they 
should be, when derived from a retest on the same or very 
similar material. Presumably a perfectly good curve of for- 
getting could be obtained from computing the reliabilities for 
different intervals between test and retest. 

The situation may be made clearer by a few concrete examples.
Test 8 in the various forms of the Army Alpha was modified to
become a hundred-item true-false test. It was given to a group
of 78 students and one week later the same group was retested
with exactly the same material. The resulting self-consistency,
or spurious reliability, was .93. The true reliability, or true
$r_{ab}$, was .70. The $r_{ax}$, or correlation between the
performance and the standard, was equal to .84, and $r_{bx}$ was
also equal to .84. Now, $.84 \times .84 = .70$, so the true
correlation is given by $r_{ax} \times r_{bx} = r_{ab}$. But to account
for the .93, something must be done to $r_{bx}$. $r_{ax}$ must keep
its value of .84, for here memory of responses in the first test
has had no chance to play its part. $r_{bx}$ must be increased
until $r_{ax} \times (r_{bx} \times y) = .93$, which requires $y = 1.32$.
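
The arithmetic of this example can be checked with a short sketch. The function name `inflation_factor` is an illustrative assumption and does not appear in the original; the function simply solves $r_{obtained} = r_{ax} \times (r_{bx} \times y)$ for $y$.

```python
def inflation_factor(r_obtained: float, r_ax: float, r_bx: float) -> float:
    """Solve r_obtained = r_ax * (r_bx * y) for y.

    r_ax and r_bx are the correlations of each performance with the
    standard; their product is the 'true' correlation of the text.
    """
    true_r = r_ax * r_bx
    return r_obtained / true_r


if __name__ == "__main__":
    # Retest example from the text: self-consistency .93, r_ax = r_bx = .84.
    y = inflation_factor(0.93, 0.84, 0.84)
    print(f"y = {y:.2f}")
```

A value of $y$ above 1.00 indicates, on the author's reading, that correlated "errors" have inflated the obtained coefficient.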

Take another case, this time obtained from personality ratings.
The self-consistency equals .85; the group consistency, .42;
$r_{ax}$ equals .64 and $r_{bx} = .66$. Now $.66 \times .64 = .42$, and
$.66 \times (.64 \times 2.02) = .85$.
A third situation may also be taken from personality ratings.
For example, let us take two traits which correlate, when
corrected for attenuation, by 1.00, showing that the same
standard is involved in both. The group consistency on the
intercomparison of the two is again .42; $r_{ax}$ is .64 and
$r_{bx}$ is .66. But the self-consistency has now dropped to .58.
In this case, $.58 = .64 \times (.66 \times 1.37)$.

The two illustrations obtained from ratings differ in one 
essential respect. In the first there is a real case of retest, for 
precisely the same task was repeated. The systematic errors 
discovered are probably due to two main causes, memory of 
what was put down on the first rating, together with the general
attitude towards the person rated. As a result, $r_{bx}$ had to be
multiplied by 2.02 to correct for both factors. In the second
rating evaluations were made on two different traits. In turning
from the first to the second trait term it seems probable that
memory would be largely if not entirely eliminated, but that the
general attitude towards the persons rated, the halo effect,
would persist. In this case $r_{bx}$ had to be multiplied by only
1.37. In other words, both factors together cause $r_{bx}$ to be
doubled, while the halo factor alone causes $r_{bx}$ to be
increased by about 1/3. Memory then is about 2/3 responsible for
the systematic errors, the halo about 1/3 responsible. When both
the memory factor and the halo factor are eliminated,
$r_{ab} = r_{ax} \times r_{bx}$.
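
The apportionment between memory and halo can be sketched as follows. This is one possible reading of the author's reasoning, namely that each factor's share is its part of the excess of the multiplier over 1.00; the function name `error_shares` is hypothetical.

```python
def error_shares(y_both: float, y_halo_only: float) -> tuple[float, float]:
    """Split the excess of the multiplier over 1.00 between memory and halo.

    y_both      -- multiplier when memory and halo both operate (2.02 in the text)
    y_halo_only -- multiplier when only the halo effect remains (1.37 in the text)
    Returns (memory_share, halo_share), which sum to 1.0.
    """
    excess_both = y_both - 1.0
    excess_halo = y_halo_only - 1.0
    halo_share = excess_halo / excess_both
    return 1.0 - halo_share, halo_share


if __name__ == "__main__":
    memory_share, halo_share = error_shares(2.02, 1.37)
    print(f"memory: {memory_share:.2f}, halo: {halo_share:.2f}")
```

Under this assumption the shares come out close to the 2/3 and 1/3 stated in the text.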

As an illustration of how the theory applies in practical 
situations we shall present a table of correlations taken from 
Schneck.⁷ This includes the intercorrelations of 5 linguistic
and 4 mathematical tests. The reliabilities are given but are 
almost certainly higher than they should be. It is also quite 
probable that the intercorrelations of the linguistic with the 
linguistic and of the mathematical with the mathematical will 
be inflated by systematic errors due to identity of items or 
principles. But in the correlations between the linguistic and 
the mathematical tests no appreciable overlap should occur. 


7 Archives of Psychol. 192, No. 107, p. 22 





372 HENRY F. ADAMS 


These 20 correlations may consequently be used to determine
the $r_{ax}$ values which are entered along the diagonal of the
table. More accurate figures could undoubtedly be obtained
from the scores made on the tests; those given are to be
considered as approximations sufficiently close to the reality to
substantiate the method. The figures in the remainder of the
table are those given by Schneck. The present writer has put
in parentheses the theoretical values obtained by the use of
the fundamental formula. The data are presented in Table XI.








TABLE XI

        A      B      C      D      E      F      G      H      I
A     (75)
B      88    (85)
      (64)
C      69    [?]    (75)
      (56)   (64)
D      54     54     45    (99)
      (74)   (84)   (74)
E     [?]    [?]    [?]    [?]    (39)
      (29)   (33)   (29)   (39)
F     [?]    [?]    [?]    [?]     12    (33)
      (25)   (28)   (25)   (33)   (13)
G     [?]    [?]    [?]    [?]     04     45    (26)
      (19)   (22)   (19)   (26)   (10)   (09)
H      05     05     05     18     07     38     25    (06)
      (05)   (05)   (05)   (06)   (06)   (02)   (02)
I      06     05     04    [?]    [?]    [?]    [?]     27    (06)
      (05)   (05)   (05)   (06)   (02)   (02)   (02)   (00)

[?] marks an entry that is illegible in the source. Decimal points are omitted.
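
The parenthesized theoretical entries of Table XI can be regenerated from the diagonal values alone. The sketch below assumes, as the text indicates, that each theoretical correlation is the product of the two diagonal values; the names `DIAGONAL` and `theoretical_r` are illustrative and not from the original.

```python
from itertools import combinations

# Diagonal values of Table XI (decimal points restored).
DIAGONAL = {
    "A": 0.75, "B": 0.85, "C": 0.75, "D": 0.99, "E": 0.39,
    "F": 0.33, "G": 0.26, "H": 0.06, "I": 0.06,
}


def theoretical_r(a: str, b: str) -> float:
    """Fundamental formula: r_ab = r_ax * r_bx."""
    return DIAGONAL[a] * DIAGONAL[b]


if __name__ == "__main__":
    for a, b in combinations(DIAGONAL, 2):
        print(f"{a}{b}: {theoretical_r(a, b):.2f}")
```

Comparing these products with the empirical entries gives the discrepancies discussed in the text.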





The mean discrepancy in the linguistic-linguistic tests is
.147; in the mathematical-mathematical, .310; in the
linguistic-mathematical, .031 ± .045. If the linguistic-mathematical
correlations are reasonably free from systematic errors, the other
two parts of the table are seen to be well saturated with them, 
the mathematical more than the linguistic, possibly because of 
their greater difficulty. The most amazing feature is revealed 
by a comparison of the theoretical $r_{ax}$ values with the empirical
values. These are given in Table XII.

TABLE XII

        THEORETICAL    EMPIRICAL
A           75             95
B           85             96
C           75             94
D           99             96
E           39             87
F           33             96
G           26             95
H           06             96
I           06             98

This table emphasizes in a very striking way how subjective 
mental tests are likely to be in spite of their alleged objec- 
tivity. 

Of more immediate interest, however, is the value of $y$ in
the equation $r_{ab(\text{obtained})} = r_{ax} (r_{bx} \times y)$. It proves to be
as follows in the various correlations.

In  ab × y =  1.38        ce × y =  1.00
    ac × y =  1.23        de × y =   .67
    ad × y =   .73        fg × y =  3.87
    ae × y =  1.02        fh × y = 19.30
    bc × y =  1.13        fi × y = 18.80
    bd × y =   .64        gh × y = 16.02
    be × y =   .79        gi × y = 19.80
    cd × y =   .61        hi × y = 75.00

This table shows very clearly that approximate values for $y$
can be very easily obtained. They are found to vary in a
very indiscriminate manner, and as a result make the table
very spotted. The table also shows with equal clarity that
the value of any coefficient of correlation can be computed
from the equation $r_{ab} = r_{ax} (r_{bx} \times y)$. When $y$ equals 1.00
the standards are identical in form and random errors only are
present.
When constant errors are superposed upon the random and
systematic, a further change is reflected in the value of the 
correlation coefficient. It may be either inflated or deflated, 
depending upon the attendant circumstances. In any case, 
the judgment is made in terms of one standard, the evaluation 
in terms of a different one. For example, if the set up of the 
experiment is such as to create illusions of size of varying 
amounts, then the judgment will be in terms of the apparent 
size, and the evaluation in terms of the real size. The absolute 
accuracy will be smaller than when no constant error is pres- 
ent; it will be smaller than accuracy figures derived from 
apparent size. Hence, correlations between two items affected 
by such a constant error will be higher than the true correla- 
tion when item to item comparison is used, but lower when 
end score is used. 

In certain of the discrimination experiments cited above
three were found to involve the same constant error. When
systematic errors had been eliminated the following results
were obtained. True $r_{ax}$, $r_{bx}$, and $r_{cx}$ values are given
along the diagonal of the table. The theoretical figures appear in
parentheses under the empirically obtained figures.

TABLE XIII

         A        B        C
A       .76
B       .68      .76
       (.58)
C       [?]      [?]      [?]
       (.55)    (.55)

[?] marks an entry that is illegible in the source.

Correction for attenuation gives .89, which is apparently
the correlation between the standards.

Lastly, we come to the case in which measurements having
very different standards are correlated, as in the “measurement”
of personality. Samples from Webb’s research will be used to
make the point clear.⁸ Traits 21 and 34, conscientiousness and
tendency not to abandon tasks from mere changeability, have a
raw correlation of .52. Their respective reliabilities are .53
and .52. Correction for attenuation will give approximately
1.00. This shows that the standards in the two cases are
identical in form, however different they may be in content.

Now consider traits 5 and 34, 5 being readiness to recover
from anger. The raw correlation is .31, the reliabilities .55
and .52 respectively. Correction for attenuation now gives
.58, showing that the standards are by no means identical in
form. The correlation which would have been obtained if the
standards had been identical can be found by the use of the
proportion $.31 : .58 = x : 1.00$, in which case $x$ proves to be
equal to .534. And .534 is the geometric mean of the
reliabilities.

As a third case, we may take the correlation between the 
traits 1 and 34, 1 being cheerfulness. The coefficient proves 
to be .06. The reliabilities are .62 and .52 respectively. The 
correlation between the standards is consequently .105. Using 
the proportion again to find out what the correlation would 
be if the standards were identical, we have .06: .105=x: 1.00. 
x consequently equals .57, which is the geometric mean of the 
reliabilities. When the standards are made identical, the 
correlation is the geometric mean of the reliabilities. 
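
The three Webb examples can be reproduced with the usual correction for attenuation, $r_{ab} / \sqrt{r_{aa} r_{bb}}$. The sketch below is only a check of the arithmetic; the function names are illustrative assumptions and not from the original.

```python
import math


def correct_for_attenuation(r_ab: float, r_aa: float, r_bb: float) -> float:
    """Raw correlation divided by the geometric mean of the reliabilities."""
    return r_ab / math.sqrt(r_aa * r_bb)


def identical_standard_r(r_ab: float, r_aa: float, r_bb: float) -> float:
    """Solve the proportion r_ab : corrected = x : 1.00 for x."""
    return r_ab / correct_for_attenuation(r_ab, r_aa, r_bb)


if __name__ == "__main__":
    # (raw correlation, reliability of first trait, reliability of trait 34)
    cases = {
        "21 and 34": (0.52, 0.53, 0.52),
        "5 and 34": (0.31, 0.55, 0.52),
        "1 and 34": (0.06, 0.62, 0.52),
    }
    for label, (r_ab, r_aa, r_bb) in cases.items():
        corrected = correct_for_attenuation(r_ab, r_aa, r_bb)
        x = identical_standard_r(r_ab, r_aa, r_bb)
        print(f"traits {label}: corrected = {corrected:.3f}, x = {x:.3f}")
```

Because the proportion divides the raw correlation by its own corrected value, $x$ necessarily reduces to $\sqrt{r_{aa} r_{bb}}$, whatever the raw correlation may be.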

How, then, are coefficients of correlation to be interpreted 
when the standards are different in form? In the last case, is 
.06 the true index, or is it .57? Certainly the value selected 
would make a difference in any conclusions drawn. The two 
interpretations are by no means comparable. Yet it is not 
unusual to find correlation coefficients derived from many 
relations of standards existing side by side in the same table 
and interpreted, all of them, on the same basis. Ordinary 
common sense should tell us that they must somehow be made 
comparable before comparisons are made. 


8 Webb, British Journal of Psychology, Monog. Supp. 1915. 
When the standards are made identical in form, the correlations
follow the fundamental formula,
$r_{ab} = \sqrt{r_{aa}} \cdot \sqrt{r_{bb}} = r_{ax} \cdot r_{bx}$. If, as was
indicated in Section II, $\sqrt{r_{aa}}$ and $\sqrt{r_{bb}}$ vary
with the discernibility of the differences to be discriminated,
then the $r_{ab}$, $r_{ac}$, $r_{bc}$ values give no new information.
Whatever they reflect is but a blurred image of what is indicated
more clearly by the immediately obtained $r_{ax}$, $r_{bx}$, and
$r_{cx}$ values; $r_{ab}$ indicates only the relative discernibility
of a hypothetical trait midway between a and b. And this value can
be obtained without going to the trouble of working out the
coefficients of correlation.

When the standards are thus made identical in form, the 
table of correlations will assume the tetrad form. 


SUMMARY OF POSITIVE FINDINGS 


1. When the standards are identical in form, the correlation 
table derived from measurements in terms of these standards 
and none other will be tetrad in form. 

2. When the standards are made identical in form by mathe- 
matical treatment, correlation tables derived therefrom will 
be tetrad in form. 

3. When the tables are tetrad in form any correlation in
the table can be obtained from the use of the fundamental
formula $r_{ab} = \sqrt{r_{aa}} \cdot \sqrt{r_{bb}} = r_{ax} \cdot r_{bx}$.

4. $\sqrt{r_{aa}}$ is an index of the amount of the difference to be
judged, and as such may be related to ability in the sense of
concrete accomplishment.

5. When the standards are not identical in form, $r_{ab}$
becomes a rough index of the correlation between the standards.
Its value is lower than it should be because of the presence
of random errors. When the random errors are eliminated
by correction for attenuation, then $r_{ab}$ shows the true
correlation between standards.

6. Partialling out the standard, when the standards are
identical in form, makes $r_{ab \cdot x}$ equal to zero, showing that
whatever correlation exists between a and b is attributable to the
standard.
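
Findings 1, 3, and 6 can be illustrated numerically. The sketch below uses the standard first-order partial correlation formula and the usual tetrad difference; the four values of $r_{ax}$ are borrowed from the diagonal of Table XI purely as illustrative inputs, and the function names are assumptions, not the author's.

```python
import math


def partial_r(r_ab: float, r_ax: float, r_bx: float) -> float:
    """First-order partial correlation of a and b with x held constant."""
    return (r_ab - r_ax * r_bx) / math.sqrt((1 - r_ax**2) * (1 - r_bx**2))


def tetrad_difference(r_ab: float, r_cd: float, r_ac: float, r_bd: float) -> float:
    """Tetrad difference r_ab * r_cd - r_ac * r_bd; zero for a tetrad table."""
    return r_ab * r_cd - r_ac * r_bd


if __name__ == "__main__":
    # Illustrative correlations with the standard for four tests.
    a, b, c, d = 0.75, 0.85, 0.75, 0.99

    # When r_ab follows the fundamental formula, partialling out x leaves zero.
    print(partial_r(a * b, a, b))

    # The same table satisfies the tetrad condition.
    print(tetrad_difference(a * b, c * d, a * c, b * d))

    # An inflated retest coefficient (.93 against .84 * .84) leaves a residue.
    print(partial_r(0.93, 0.84, 0.84))
```

The third call shows the converse case: a coefficient inflated by correlated errors does not vanish when the standard is partialled out.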
SUMMARY OF THE ALTERNATIVE THEORIES


1. Spearman believes in the existence of a general factor, 
g, or mental energy, which is responsible in part at least for 
the correlation between tests. 

The writer has shown that g may be identified with a more
inclusive factor, x, the standard in terms of which the
measurement or evaluation is made.

2. For Spearman, the value of g may vary between 0 and
1.00. For the writer, x is always equal to 1.00.

3. Spearman’s specific factor, s, is derived from errors, and
may have any value from 0 to 1.00, provided only that
$r_{ag}^2 + r_{as}^2 = 1.00$. The sum of g and s may, however, be
greater than 1.00. The s’s are uncorrelated with each other
and with g.

For the writer, e is also derived from errors and has always
the value of 0. x + e must always equal exactly 1.00. The
errors are uncorrelated with each other. Having a constant
value of 0, they are uncorrelated with x.⁹ The $r_{xe}$ values must
consequently be 0 also.

4. Group factors, for Spearman, result from overlapping 
s’s, or correlated errors. They are named in terms of the 
situations in which they are prone to occur, without effort to 
trace them to any more fundamental source. 

For the writer, correlated errors appear only when two or 
more standards are being used. The experimental set up 
presupposes that one standard be used throughout but permits 
the employment of one or more additional standards, only to 
evaluate the whole performance in terms of the expected 
standard. Memory is presumably the cause of many of the 
correlated errors in the usual test procedures. 

5. Spearman apparently believes that the correlations may 
indicate the relations between abilities. 

The writer has furnished evidence suggesting that the corre- 
lations indicate the relations between the standards. 

9 Uncorrelated may have either of two meanings: first, a possibility
of correlation but a coefficient of 0; second, no possibility of correlation,
and consequently no coefficient. Here, the latter meaning is used.








THE RELATIONSHIP BETWEEN CHARACTER TRAIT 
RATINGS AND CERTAIN MENTAL ABILITIES. 


K. C. GARRISON and SUE CRAFT HOWELL 
North Carolina State College 


The measurement of traits of personality, of which charac- 
ter is a component, is probably receiving more attention now 
than any other single problem in psychology. The first of
these traits for which objective measurement was attempted
was intelligence; but it was soon discovered that intelligence
was not the only one to be considered, despite the fact that it
is a very important one. Within the past several years
numerous attempts have been made to measure traits other
than intelligence. The recent trend of thought, based in the
main on the scientific work in this field, relates these traits
into a unified whole in the personality make-up rather than
isolating them as so many separate phases of the individual’s
personality.

Recent investigations have added weight to Terman’s pre- 
diction in 1927, that the problem of diagnosing anti-social 
tendencies is as capable of scientific solution as is the problem 
of the measurement of such a complex function as intelli- 
gence (13). The amount of work being done in the evaluation 
and measurement of various character traits is evidenced 
from the yearly summaries in the Psychological Bulletin, 
by May and Hartshorne, and others. Another helpful summary
of investigations in the study of character is that by
Sister Rosa McDonough (11). An examination of these summaries
will not only give one a better understanding of the
specific techniques being used, but also a realization of the 
immensity of the task at hand. 

The rating technique has been devised to facilitate the
judging of one person by another, by providing the judge
with a list of traits and having him state possible gradations
within each trait. The improvements that have come about
in studying character by this technique are due to several
modifications in the procedures. These changed procedures
are of the following types: (1) the character area or trait
being rated is more objectively defined for the different
gradations; (2) efforts are made to eliminate the “halo” effect;
(3) ratings are to be obtained by several judges; and (4) all
ratings are to be based upon observations of conduct.

In the present investigation a graphic rating scale was devised
and an effort made to embody such procedures as have
been found to increase the reliability and validity of such a
scale. The illustration here in the case of promptness will
show the general technique involved.


PROMPTNESS

Never prompt in any       Prompt if task can be       Always prompt to perform
activity unless forced.   performed with little       assigned or assumed tasks.
                          difficulty, or tends to
                          serve him well.

This study does not attempt to set up new methods for rat- 
ing character; but it does attempt as its primary purpose to 
reveal as far as possible by means of improved techniques al- 
ready in use what relation exists between character ratings 
and certain mental abilities of pupils. For the purpose of 
this study the term ‘‘character’’ will be used to indicate the 
subject matter under investigation. The items isolated for 
this study are: 

1. Specific character traits. 

2. Certain mental abilities; namely, intelligence, vocabu- 
lary, and scholarship. 

When attempting to solve the main problem of this study 
certain subsidiary problems arose, which were: 

1. To what extent are the teachers’ ratings reliable? 

2. What relationship exists between the results obtained
from the three measures for honesty, namely: (a) teachers’
judgments, (b) pupils’ judgments, and (c) the subjects’
reactions to an objective test?

3. What relation exists between the social and tempera- 
ment traits and those classified as mental and activity? 

4. Which traits are rated highest and which lowest? 

Nineteen character traits were selected for study. After 
a rather careful analysis of the meaning of these various 
traits, an arbitrary division of these was made. Eight traits
were placed in the division referred to as Mental and Activity
Traits, and the other eleven traits were classified as Social
and Temperament Traits. The eight Mental and Activity
Traits are: persistence, promptness, orderliness, attention,
executive ability, initiative, quickness of thought, and
judgment; the eleven Social and Temperament Traits are:
courtesy, sociability, self-control, kindness, unselfishness,
reliability, honesty, intellectual modesty, stability, cheerfulness,
and emotional self-control. Although this division was
subjectively arrived at by the pooled judgment of a graduate
class in educational psychology consisting of fourteen members,
it is believed by the investigators to be a fairly accurate
division of such traits, although in the final analysis
these will reveal a great deal of overlapping.

For this study sixty-two eighth grade pupils from the high 
school of Raleigh, North Carolina, were selected. The school 
from which the subjects were taken is somewhat unique. In 
it the activity program is greatly stressed. A school of this 
type has as one of its main objectives character training and 
emphasizes social training, which is perhaps badly neglected 
in schools of the more formal type. 

Three raters were chosen from the high school faculty. 
Each of these raters had had the subjects for at least five 
recitation periods per week for half the school term of nine 
months. The raters were asked not to consult each other con- 
cerning the ratings given to a particular subject. These scales 
were in the hands of the raters for eight weeks; thus ample 
time was given for deliberation in judging. 
Cady’s (2) investigation showed the value of a period of
observation prior to rating; the reliability coefficient of rat- 
ings without an observation was .63 and with observation .87. 
Slawson (12) found that one means of reducing differences 
is to bring the traits to be rated to the attention of the raters 
through a period of time. 

Other data available on each of the sixty-two subjects were:
(1) results on the Terman Group Test of Intelligence, Form
A; (2) scores on a sentence vocabulary test designed by Garrison¹
after the Holley Sentence Vocabulary Test; (3) scholarship
determined by averaging the fall term grades in all school
subjects; (4) combined ratings of each pupil by all other
pupils on the trait for honesty; and (5) objective test data
on honesty.

These data on honesty were gathered in order to check the 
teachers’ ratings against the pupils’ ratings, and to make com- 
parisons between the results from this study and results ob- 
tained from previous investigations. The objective test used 
for honesty was the C. E. I. Spelling Test devised by Hart- 
shorne and May in connection with the Character Education 
Inquiry, sponsored by the Institute of Educational Research 
Bureau, Teachers College, Columbia University (4). 

These spelling tests do not propose to measure the complex 
character trait of honesty from every angle; but under the 
constant stimulus of classroom urge one may assume that they 
measure to some degree the same general phase of honesty ob- 
served by the teachers and fellow pupils of the various sub- 
jects. 

Table I shows the correlations between the average rating 
given by teachers, by pupils, and by ‘‘Corrections of Mis- 
spellings.’’ 

1 Garrison, K. C. “Correlations between Three Different Vocabulary
Abilities.” J. Ed. Res., 1930, 21, 43-45.

TABLE I
Correlations Between Various Measures for Honesty

Ratings by Teachers and Ratings by Pupils                   .71 ± .04
Ratings by Teachers and Objective Results from
  Corrections of Misspellings                               .05 ± .09
Ratings by Pupils and Objective Results from
  Corrections of Misspellings                               .02 ± .09

The correlation of .71 ± .04 between the combined ratings
of three teachers and those of pupils is significant in that it
shows that the teacher and pupil raters are evidently measuring
to a large extent the same thing, whether it be conduct,
halo effect, or some special aspect of the subjects’ observed 
behavior. The two are so closely related in their ratings that
one would conclude either that some common aspect of the
subjects’ behavior is observed by the other pupils as well as by
the teachers, or that a known reputation of the child affected
the ratings of both groups. The low correlation
of the ratings for honesty, or rather dishonesty, for it
was the negative side which the test measured, with the scores 
given by the objective measure is interesting because it in- 
dicates that the phase of honesty measured by the test does not 
have elements in common with the phase of behavior observed 
by pupils and teachers. This result is to be expected because 
deceit in itself is a trait that the child tends to cover up, thus 
the inner self is not expressed in many cases in observable
behavior. Therefore, if the child is successful in cheating in
the classroom there is a likelihood that he will be successful in
deceiving his classmates and teachers. Furthermore, in this
case the honesty test covered only one type of classroom hon- 
esty at a particular time under given circumstances whereas 
the ratings were based upon repeated observations of the sub- 
jects’ conduct. 
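
The ± figures attached to the coefficients appear consistent with the conventional probable error of a correlation coefficient, $PE = 0.6745\,(1 - r^2)/\sqrt{N}$, with $N = 62$. The text does not state which formula was used, so the sketch below is an assumption offered only as a check; the function name `probable_error` is illustrative.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Conventional probable error of a product-moment correlation."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


if __name__ == "__main__":
    n_pupils = 62  # number of subjects reported in the study
    for r in (0.71, 0.05, 0.02):  # coefficients from Table I
        print(f"r = {r:.2f}, PE = {probable_error(r, n_pupils):.3f}")
```

Rounded to two decimals, these values agree with the ± figures given in Table I.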

Some traits, persistence and self-control, for example, mani- 
fest themselves more accurately in the subjects’ overt be- 
havior. A very careful investigation made by Hartshorne 
and May (5) confirms this statement. For honesty, they found 
a correlation between conduct and ratings of .35; for
self-control .80; and for persistence .52.

Table I shows even less relation between ratings and con- 
duct for honesty than Hartshorne and May’s investigations. 
This is probably due in part to the limited number of ob- 
jective tests used in testing honesty in this study. 

The teachers’ judgments on all traits were not checked
against pupil ratings, for such a procedure, although valuable,
was not deemed necessary in fulfilling the general aim of this
study. The reliability of teachers’ judgments on the other
eighteen traits may be reckoned in part from a study of the 
agreement of the raters. Table II shows the group consis- 
tency of the judges. The agreement of the raters is very 
striking in this table, the greatest average variability being 
only .5. On only one trait is the variability this large; the 
total average-variability for both scales is only .3. This is 
a very significant result in view of the fact that the raters had 
a possibility of varying from a rating of one to a rating of five. 

Another interesting feature of Table II is the fact that 
each rater is so close to the average with respect to the con- 
sistency in rating a given trait. In the light of these results, 
the teachers’ ratings can be said to be reliable, the reliability 
being dependent upon the group consistency of the raters. 

Any measuring instrument is reliable just to the extent to
which it always gives nearly the same or at least consistent
results (1). If a scale in the hands of one registers high, but
in the hands of another low, the reliability is slight. In the
field of character measurement we shall never be able to de- 
vise a scale that will possess 100 per cent. reliability. Even 
if the scale were perfect the traits of personality are variable. 
Although variability exists, in all probability a trait would 
tend to fluctuate about a rather constant point in its constant 
manifestations. When ratings are made from year to year, 
84.6 per cent. of students remain within one-half step from 
the initial ratings according to the studies of Hughes (6). 
TABLE II
Variability Table Showing Group Consistency of Each Rater

SOCIAL AND TEMPERAMENT TRAITS

                               Variability from Average
TRAIT                        Rater 1   Rater 2   Rater 3   AVERAGE VARIABILITY*
Courtesy                       .3        .2        .2          .2
Sociability                    .3        .3        .3          .3
Self-control                   .4        .3        .3          .3
Kindness                       .3        .2        .3          [?]
Unselfishness                  .4        .2        .3          .3
Reliability                    .4        [?]       [?]         .3
Honesty                        .2        .2        .3          .2
Intellectual Modesty           .3        [?]       .2          .2
Stability                      .5        .4        .4          .5
Cheerfulness                   .5        .2        [?]         .3
Emotional Self-control         .4        .3        .3          .3
Average of the average variability                             .29

MENTAL AND ACTIVITY TRAITS

                               Variability from Average
TRAIT                        Rater 1   Rater 2   Rater 3   AVERAGE VARIABILITY*
Persistence                    .3        .3        .3          .3
Promptness                     .3        .2        .3          .3
Orderliness                    .2        .2        .3          .2
Attention                      .3        .3        .3          .3
Executive Ability              .3        .3        .3          .4
Initiative                     .4        [?]       .5          .4
Quickness of Thought           .3        .2        .4          .3
Judgment                       .4        .3        .4          .3
Average of the average variability                             .31

* The average variability was computed to the nearest one-tenth.
[?] marks an entry that is illegible in the source.

The validity of a rating scale depends upon the extent to 
which the scale actually measures that which it purports to 
measure. Due to the fact that validity in the final analysis 
has a subjective reference, one cannot say with certainty to 
what extent any measure is actually measuring the trait that
it is intended to measure. The standard set up by which the 
validity of a test is to be checked has in itself certain sub- 
jective references. The ratings here introduced are based 
upon observations and might be thought of as measures for 
reputation rather than character. The close agreement of the 
raters will lead one to conclude that these raters agree fairly 
well upon the subjects’ reputation, character, or observable 
conduct as the case may be. Table III will give some basis 
for judging the validity of three character traits chosen from 
the mental and activity traits as quite important and repre- 
sentative of these traits. 


TABLE III
Correlation between Certain Mental and Activity Traits and Three
Criteria for Higher Mental Ability

CHARACTER TRAITS             INTELLIGENCE   VOCABULARY ABILITY   SCHOLARSHIP
Persistence                  [?] ± .08      .34 ± .08            .73 ± .04
Attention                    .08 ± .09      .15 ± .08            .74 ± .04
Initiative                   .18 ± .08      .43 ± .07            .80 ± .03
Total Mental and Activity
  Traits                     .46 ± .07      .52 ± .06            .73 ± .04

[?] marks an entry that is illegible in the source.














The figures in the column under intelligence show little
relation between the particular traits mentioned and the
subjects’ intelligence quotients. The low correlations between
persistence and intelligence, attention and intelligence, and
initiative and intelligence are probably due to the fact that the
most intelligent are not compelled to persevere and attend to 
the various school tasks in order to grasp the thought to the 
same extent that some of the less intelligent would. Many 
of the more intelligent pupils thus come to develop undesirable 
mental and activity habits relating to their school tasks. In 
the groups studied the Intelligence Quotients ranged from 70 
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to 145. It is interesting to note the increase in correlation be- 
tween intelligence and the total mental and activity traits over 
the single traits referred to in Table III. This is due to the 
fact that traits other than persistence, attention and initiative 
are more closely related to intelligence, as well as to the addi- 
tion of common factors as more traits slightly related to in- 
telligence are introduced. Hughes (6) found from twelve 
selected traits quickness of thought correlated the highest with 
intelligence. This same study yields a low correlation for 
persistence, attention and initiative. 

As we turn from the correlation with intelligence to those 
with vocabulary ability and with scholarship, higher correla- 
tions are found, the highest being with scholarship. It is 
quite possible that these high correlations with scholarship are 
due in part to the fact that the same factors that cause the 
teachers to rate students high in scholarship are operating to
rate them high on the character rating scale. This assump- 
tion is no doubt in part true. 


TABLE IV
Correlation between Certain Social and Temperament Traits and Certain
Mental Abilities

CHARACTER TRAITS                  INTELLIGENCE    VOCABULARY    SCHOLARSHIP
                                                  ABILITY
[illegible, probably
  Sociability] .................  [illegible]     .27 ± .08     .52 ± .06
[illegible] ....................  .29 ± .08       .25 ± .08     .40 ± .07
Emotional Self-control .........  .05 ± .09       .06 ± .09     .13 ± .08
Total Social and
  Temperament Traits ...........  .39 ± .07       .42 ± .07     .64 ± .05














Since the social and temperament traits are dependent pri- 
marily upon the individuals’ social experiences rather than 
based so much on learning ability and various other mental 
processes, one would expect to find lower correlations with 
the measures of higher mental ability. The correlations 
presented in Table IV confirm this assumption. The three
social traits listed were selected as somewhat representa- 
tive of the traits taken from the eleven traits of the Social 
and Temperament Scale. These traits are quite different 
in nature, and thus represent different phases of the sub- 
jects’ social and temperament character. Again, one finds 
low and insignificant correlations with intelligence; but the 
correlations with vocabulary ability and scholarship are higher 
and consistently positive. The high correlation between 
sociability and scholarship is rather hard to explain and is 
possibly due in part to the special peculiarity of the group 
studied. In a special group of this type where socialized work
is promoted and a more informal type of activity is adhered to 
the social and temperament traits are developed so that this 
probably plays a part either in the grading of the pupils or in 
the building up of specific types of character traits in the 
subjects concerned. 

When the social traits are taken as a composite group and 
correlated with the three mental abilities under consideration, 
results similar to those with respect to mental and activity 
traits are found. The relationship tends to increase rapidly 
but it never reaches the degree that the mental and activity 
traits do. Even this is to be expected since, as was brought out
earlier, the social traits are not as important in the general
make-up of the subjects’ mental abilities.

As a matter of incidental interest, an analysis was made of 
the various phases of the measuring scale. A significant fact 
is that the two scales despite the difference in the nature of the 
traits listed, are closely related. This study does not provide 
for determining to what extent each trait is related to the 
others, but taking the two scales as composite wholes their
correlation was found to be .67 ± .05. If a student has a high
total average in social and temperament traits, he will quite
likely rate above the average in mental and activity traits. A
further analysis of the two scales reveals that this particular
group was rated higher on the average in the social and
temperament traits than on the mental and activity traits.
One cannot generalize from this that such would be the case
with all groups; for, as it has already been pointed out, in
this special group the social and temperament traits are
developed and considered in a larger degree than is the usual
case. It is interesting to note how closely the traits of like
nature on the scales were ranked. For instance, kindness, 
unselfishness, and intellectual modesty, traits of a similar 
nature though distinct from each other, were rated on the 
average 4. 

The results from this study may be summarized as follows: 

1. The results from the single objective measure of honesty 
show that it does not measure the same phase of the trait that 
the teacher and pupil raters do. 

2. The ratings are somewhat consistent when three judges, 
with a fair degree of familiarity with the subjects, use the 
character rating scale devised for this study. 

3. Character traits taken separately as a rule do not show 
close relationship with intelligence and vocabulary ability, 
but this correlation is considerably higher when the total 
ratings are considered. 

4. Positive and reliable correlations were found between 
the various character traits and scholarship. 

5. Mental and activity traits correlate higher than social
and temperament traits with the various measures of mental 
ability. This is to be expected though, since the mental and 
activity traits as a whole are important elements of mental 
ability. 

6. A correlation of .67 ± .05 was found between the total
ratings on the mental and activity traits and those on the 
social and temperament traits. 

7. Of all the nineteen traits listed stability was judged with 
greatest variability. This is an indication of either difficulty 
of judging this trait on the basis of observation or general 
reputation, or lack of knowledge concerning the existence of
such elements as make up this trait.

8. The subjects of this study were rated higher on the social
and temperament traits than on the mental and activity traits.


REFERENCES

1. ADAMS, H. F. “Objectivity-Subjectivity Ratio.” J. Soc. Psychol.,
   1930, 1, 122-134.
2. CADY, V. M. “The Estimation of Juvenile Incorrigibility.” J.
   Delinq. Monog., No. 2, 1923.
3. CHAMBERS, O. R. “Character Trait Tests and the Prognosis of
   College Achievement.” J. Abn. and Soc. Psychol., 1925, 20,
   303-311.
4. HARTSHORNE, H. and MAY, M. “Studies in Deceit.” The Macmillan
   Company, 1928.
5. HARTSHORNE, H. and MAY, M. A. “Recent Improvements in Devices
   for Rating Character.” J. Soc. Psychol., 1, 66-76.
6. HUGHES, H. W. “Relation of Intelligence to Trait Characteristics.”
   J. Educ. Psychol., 1926, 17, 482-494.
7. MAY, M. A. and HARTSHORNE, H. “Personality and Character
   Tests.” Psychol. Bull., 1926, 23, 395-411.
8. MAY, M. A., HARTSHORNE, H., and WELTY, R. E. “Personality
   and Character Tests.” Psychol. Bull., 1927, 24, 418-435.
9. MAY, M. A., HARTSHORNE, H., and WELTY, R. E. “Personality
   and Character Tests.” Psychol. Bull., 1928, 25, 422-443.
10. MAY, M. A., HARTSHORNE, H., and WELTY, R. E. “Personality
    and Character Tests.” Psychol. Bull., 1929, 26, 418-444.
11. McDONOUGH, Sister M. ROSA. “Empirical Study of Character.”
    Studies in Psychology and Psychiatry, 1929, 2, 1-144.
12. SLAWSON, JOHN. “The Reliability of Judgments of Personal
    Traits.” Unpublished Master’s Thesis, Department of Psychology,
    Columbia University.
13. WELLS, F. L. “Mental Tests in Clinical Practice.” World Book
    Company, 1927, p. 275-276.








THE DEVELOPMENT OF A NEW METHOD FOR 
DETERMINING THE RELATIVE EFFI- 
CIENCIES OF ADVERTISEMENTS 
IN MAGAZINES 


KEY LEE BARKLEY 
University of North Carolina 


INTRODUCTION 


The methods used in experiments in psychology of adver- 
tising to determine the efficiency of magazine advertisements 
have customarily involved the use of the number of recalls or 
the order of recalls as the determining factor when the recall 
method of securing the data has been employed. No research 
has come to the attention of the writer in which both of these 
factors have been given consideration in determining the
efficiencies of the advertisements. It has, however, been
generally recognized that the efficiency of the advertisements is
indicated by both the frequency of recalls and by the order 
of recall. The methods which have given consideration to 
only one factor are to be considered inadequate. The per- 
centage of recall method gives weight to the number of 
recalls, but does not emphasize the special value which should 
be given to the order of recall. The average rank method 
gives significance to the rank of recall, but does not emphasize 
the factor of number of recalls. Therefore, by use of the 
percentage of recall method, an advertisement might be shown 
to have a high efficiency when the recalls were secured in the 
higher ranks and thus no consideration be given to the factor 
of order of recall. By use of the average rank of recall an 
advertisement with only a few recalls might possibly have a 
low average rank and therefore be considered to have a high 
efficiency if proper consideration were not given to the factor 
of number of recalls. There has been, consequently, a desire 
on the part of those interested in research in psychology of
advertising for some method of determining the efficiencies 
of advertisements which would give weight to both these fac- 
tors simultaneously. The work which is reported in this 
paper grew out of this demand for a new method. 

Since the formula developed in this work does give weight 
to these two factors of number of recalls and the order of 
recalls, it is the belief of the experimenters that use of the 
formula gives a more adequate indication of the actual effi- 
ciency of the advertisement than either of the two traditional 
methods. The formula is intended for use in experiments in 
which the recall method of securing the data is used and in 
which the experimenter identifies and tabulates the advertise- 
ments in the order of their recall. If the recalls are tabulated 
on large sheets of distribution paper with horizontal columns 
representing the advertisements and vertical columns the order 
of recalls, the elements of the formula can be worked out quite 
easily. The only thing which remains to be done in calculat- 
ing the index of efficiency of any given advertisement is to 
substitute in the formula and proceed to solve it. 





HISTORY AND EVOLUTION OF THE FORMULA WHICH IS PRESENTED 
AS THE NEW METHOD IN THIS WORK 








The chief emphasis in this paper will be upon the formula 
which is the heart of the new procedure for determining the 
efficiencies of advertisements in magazines. The significance 
of the method depends upon the adequacy of the formula 
developed. The method of application of the final statement 
of the formula and the difference between it and the earlier 
statements perhaps will be shown best by tracing the develop- 
ment of the formula through its various stages. 

Dr. Harry W. Crane, Professor of Psychology at the Uni- 
versity of North Carolina, recognized the need of having some 
method which would give consideration to both the number of 
recalls and the rank or the order of recall in determining the 
efficiencies of advertisements in magazines. He secured the 
aid of Dr. A. P. Weiss, of Ohio State University, in working
on the problem. Dr. Weiss formulated and suggested the 
original statement of the formula. It was written thus 


\[ \frac{\text{Sum of the Ranks}}{(\text{Number of Recalls})^2} = \text{Efficiency} \]





(To get the ‘‘Sum of the Ranks,’’ see footnote below.) 


The probable derivation of this formula is as follows:
Let

\[ \frac{\text{Average Rank}}{\text{Number of Recalls}} = \text{Efficiency} \]

Now

\[ \text{Average Rank} = \frac{\text{Sum of the Ranks}}{\text{Number of Recalls}} \]

Substituting in the formula we get

\[ \frac{\dfrac{\text{Sum of the Ranks}}{\text{Number of Recalls}}}{\text{Number of Recalls}} = \text{Efficiency} \]

Resolving the formula we get

\[ \frac{\text{Sum of the Ranks}}{(\text{Number of Recalls})^2} = \text{Efficiency} \]

and this is the formula which Dr. Weiss proposed.
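
As a rough illustration, which is not part of the original paper, the original statement of the formula can be sketched in Python. The function names and the dictionary layout (each rank mapped to the number of recalls at that rank) are assumptions made for this sketch; the figures are those of the paper's footnote example, three recalls at rank three, two at rank six and one at rank nine.

```python
def sum_of_ranks(recalls_by_rank):
    """Multiply each rank by the number of recalls at that rank and add the products."""
    return sum(rank * count for rank, count in recalls_by_rank.items())


def original_index(recalls_by_rank):
    """Original statement: Sum of the Ranks / (Number of Recalls)^2."""
    n_recalls = sum(recalls_by_rank.values())
    if n_recalls == 0:
        raise ValueError("the advertisement was never recalled")
    return sum_of_ranks(recalls_by_rank) / n_recalls ** 2


# Assumed layout for this sketch: {rank: number of recalls at that rank}
footnote_example = {3: 3, 6: 2, 9: 1}
print(sum_of_ranks(footnote_example))    # (3x3) + (2x6) + (1x9) = 30
print(original_index(footnote_example))  # 30 divided by 6 squared
```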











SOURCE OF THE ELEMENTS IN THE FORMULA 


Some explanation is needed for the use of the factors of 
the formula in the relations in which they appear. As has 
already been stated, the number of recalls and the ranks of 
the recalls are the factors which indicate the efficiency of an 
advertisement. The efficiency of an advertisement would, as 
a consequence, vary as the factors of number of recalls and 
order of recalls vary. Since this concomitant variation is an
evident fact, then the experimenter should secure some means
of showing the effects of variation of one or of both factors
on the resulting index of efficiency of an advertisement.

1 To get the sum of the ranks, multiply the rank of recall by the
number of recalls under that rank and take the sum of the products. Thus
suppose that a given advertisement were recalled three times at rank
three, twice at rank six and once at rank nine. Then the sum of the
ranks would be as follows:
(3 × 3) + (2 × 6) + (1 × 9) = 9 + 12 + 9 = 30

A glance at the formula in its final expression, and in any 
of the stages through which it has come to its present state- 
ment, will reveal that by means of it the experimenter does 
secure a change in the index of efficiency when there is a 
change in either or both of the significant factors. There are 
other algebraic relationships in which these factors could be 
treated and which would still show the desired variation in 
the index of efficiency. However, by the methods other than 
the one used, the indices secured would range over a wide 
territory and be so unwieldy as to make them difficult to 
manipulate. The relationship in which the factors are used 
in this work is considered to be the most facile of any. Fur- 
thermore, the final statement of the formula gives indices 
which range within narrow limits, but which may, neverthe- 
less, be given significant meaning in the interpretations. 


CRITICISM OF THE ORIGINAL FORMULA 


It will be noted that the original formula stated by Pro- 
fessor Weiss accomplishes the purpose of giving consideration 
to both the factors mentioned above as the ones which indicate 
the efficiency of an advertisement, namely, the number of 
recalls and the order of recalls. Upon practical application 
of the formula, however, it was found that it indicated an 
increase in efficiency by a decrease in the size of the numerical 
index of efficiency. For example, if two advertisements were 
recalled the same number of times, but at different average 
ranks, the one with the lower average rank (and this one 
would be considered to have the higher efficiency) would be 
given a smaller index than the less efficient one, as shown 
below. 

Example I— 

Suppose an advertisement to be recalled ten times and in 

such an order as to give 28 as the sum of the ranks. 
Then Number of Recalls = 10
Sum of the Ranks = 28

Substituting in the formula we get

\[ \text{Efficiency} = \frac{\text{Sum of the Ranks}}{(\text{Number of Recalls})^2} = \frac{28}{(10)^2} = \frac{28}{100} = .28 \]


















Example II— 

Suppose another advertisement to be recalled the same num- 
ber of times as the one above, but in such an order that the 
sum of the ranks is 23 instead of 28. 


Then Number of Recalls = 10
Sum of the Ranks = 23

Substituting in the formula we get

\[ \text{Efficiency} = \frac{\text{Sum of the Ranks}}{(\text{Number of Recalls})^2} = \frac{23}{(10)^2} = \frac{23}{100} = .23 \]





This method of indicating greater efficiency by a smaller 
numerical index was found to be confusing and because of this 
difficulty the original formula was considered inadequate. It 
should be noted that except for rare cases the index of effi- 
ciency would be in the form of a fraction. 
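
The difficulty can be seen by running Examples I and II through a short Python sketch. The function name is an assumption made for illustration; the numbers are those of the two examples.

```python
def original_index(sum_of_ranks, n_recalls):
    """Original statement: Sum of the Ranks / (Number of Recalls)^2."""
    return sum_of_ranks / n_recalls ** 2


example_1 = original_index(28, 10)  # ten recalls, sum of the ranks 28
example_2 = original_index(23, 10)  # ten recalls, sum of the ranks 23

# Example II has the lower (better) average rank, yet it receives the smaller index.
print(round(example_1, 2), round(example_2, 2))
print(example_2 < example_1)
```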

In order to overcome the difficulty of indicating increased 
efficiency by a decreased numerical index, Dr. Crane suggested 
that the order of the factors in the formula be reversed, that 
is, the positions as numerator and denominator be reversed. 
This reversal gave the formula 


\[ \frac{(\text{Number of Recalls})^2}{\text{Sum of the Ranks}} = \text{Efficiency} \]


This change did remove the difficulty as to the form of the 
resulting index of efficiency, but the formula was still inade- 
quate for use in determining the indices of efficiency of 
advertisements by groups of subjects which varied in size. In 
other words, the formula did not give the same index of effi- 
ciency to advertisements which had the same percent of recall 
and the same average rank if different numbers of subjects 
were used in the experiments. 
In order to demonstrate that a difference in the number of
subjects used would bring about a marked difference in the 
indices of efficiency as determined by the above statement of 
the formula, three examples were postulated in which adver- 
tisements were recalled the same percentage of possible times 
and at the same average ranks. The size of the groups was 
varied, however, so that the effect of the size of the group 
could be studied. 

Advertisement number 1:
Fifty subjects out of a group of one hundred were supposed
to have recalled this advertisement, which would give a
percentage of recall of fifty. The recalls were distributed in
the following manner:

                                   Sum R
10 in the first place ..........     10
10 in the fifth place ..........     50
10 in the sixth place ..........     60
10 in the tenth place ..........    100
10 in the twelfth place ........    120
Total Sum of the Ranks .........    340
Average Rank ...................    6.8

Substituting in the formula

\[ E = \frac{N^2}{\text{Sum } R} = \frac{(50)^2}{340} = \frac{2500}{340} = 7.35294 \]
Advertisement number 2:
Twenty-five subjects out of a group of fifty were supposed
to have recalled this advertisement, which would give a
percentage of recall of fifty. The recalls were distributed in the
following manner:

                                   Sum R
5 in the first place ...........      5
5 in the fifth place ...........     25
5 in the sixth place ...........     30
5 in the tenth place ...........     50
5 in the twelfth place .........     60
Total Sum of the Ranks .........    170
Average Rank ...................    6.8

Substituting in the formula

\[ E = \frac{N^2}{\text{Sum } R} = \frac{(25)^2}{170} = \frac{625}{170} = 3.67647 \]
Advertisement number 3:
Ten subjects out of a group of twenty were supposed to have
recalled this advertisement, which would give a percentage of
recall of fifty. The recalls were distributed in the following
manner:

                                   Sum R
2 in the first place ...........      2
2 in the fifth place ...........     10
2 in the sixth place ...........     12
2 in the tenth place ...........     20
2 in the twelfth place .........     24
Total Sum of the Ranks .........     68
Average Rank ...................    6.8

Substituting in the formula

\[ E = \frac{N^2}{\text{Sum } R} = \frac{(10)^2}{68} = \frac{100}{68} = 1.47059 \]
These examples indicate that the index of efficiency of an 
advertisement as determined by use of the formula 


\[ \frac{N^2}{\text{Sum } R} = E \]


would be very different when determined by using different 
sized groups of subjects even though it had the same average 
rank and the same percent of recall in each case. In fact, 
when the average rank and the percent of recall remained the 
same, the index of efficiency would vary directly as the size 
of the group varied. The examples show the truth of the last 
statement: When, under the conditions as noted, the number 
of subjects is cut in half, the index of efficiency is found to be 
just half of what it was formerly, and when the number of 
subjects is only one fifth as great as in the first instance, the
index of efficiency is also shown to be one fifth as great. 
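
The dependence on group size can be checked with a small Python sketch of the reversed formula applied to the three postulated advertisements. The function name and the tuple layout are assumptions made for illustration; the numbers come from the three examples above.

```python
def reversed_index(n_recalls, sum_of_ranks):
    """Reversed statement: (Number of Recalls)^2 / Sum of the Ranks."""
    return n_recalls ** 2 / sum_of_ranks


# (group size, number of recalls, sum of the ranks) for the three advertisements
cases = [(100, 50, 340), (50, 25, 170), (20, 10, 68)]

indices = [reversed_index(n, sum_r) for _, n, sum_r in cases]
for (group, n, sum_r), e in zip(cases, indices):
    # Percent of recall and average rank are identical in every case;
    # only the index changes with the size of the group.
    print(f"group of {group}: percent of recall {100 * n / group:.0f}, "
          f"average rank {sum_r / n:.1f}, index {e:.5f}")
```

Every case has a percent of recall of fifty and an average rank of 6.8, yet the index falls in step with the size of the group.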
The question might be raised at this point as to whether or
not the efficiency of an advertisement should not be shown to
be greater when a group of one hundred subjects is used than
when a group of fifty subjects is used, provided, of course,
that in each case it has the same average rank and the same
percentage of recall. In other words, should not the efficiency 
of an advertisement be shown to increase in some proportion 
as the size of the group of subjects increases, provided it still 
has the same average rank and the same percentage of recall? 

It has been made clear above that the efficiency of an ad- 
vertisement is indicated by the average rank and by the num- 
ber of times it is recalled. The size of the group of subjects 
should make no difference in the index of efficiency. The 
criticism which could be offered here is one regarding the 
reliability of the index as determined by use of smaller and 
larger groups of subjects. The factors of chance enter to 
give greater reliability to the index when it is determined by 
a large group than when it is determined by a small group. 
This greater reliability of the index determined by a larger 
group of subjects, however, is not to be translated into terms 
of efficiency. Judgment of the reliability of the index as 
dependent upon the number of subjects used is an inde- 
pendent consideration, and does not involve the factors which 
indicate the efficiency of the advertisements. 

The examples given above indicate further that the formula 
as stated could be used to determine the relative efficiency of 
advertisements only when one particular group of subjects is 
used, or when exactly the same sized groups of subjects are 
used. Under these conditions the index of efficiency of one 
advertisement would be legitimately comparable to the indices 
of other advertisements studied by the same group or groups. 
Furthermore, it should be added, the average indices of 
groups of advertisements which have been determined by 
using different sized groups of subjects legitimately may be 
compared provided the experimenter employs the cumbersome 
and faulty procedure of taking an equal number of advertisements
of each class from the results obtained from each group
of subjects. It is very improbable, however, that such a 
procedure would be feasible or possible. Hence, some method 
is needed which will make the results from all groups, regard- 
less of size, comparable. 

To meet the above mentioned need a new statement of the 
formula has been made which contains a constant that removes 
the objection to the formula discussed above. The new state- 
ment of the formula is written as follows: 


\[ \frac{N^2}{\text{Sum } R} \times \frac{100}{gn} = \text{Efficiency} \]

N = Number of Recalls
Sum R = Sum of the Ranks
gn = Group Number
100 = Standard Number of Subjects
E = Efficiency

By use of this formula the results from all groups will be in 
the same terms, and if the average rank and the percent of 
recall are the same for any two advertisements, then the 
indices of efficiency as determined by the formula will be 
equal, regardless of the size of the group of subjects used in 
the experiments. For example we may use the three illustrations
given above to show that the indices of efficiency of
advertisements varied directly as the size of the group of
subjects when we employed the formula \(N^2 / \text{Sum } R = E\). Applying
the new formula to these examples we have the following:

Example number 1
N = 50
Sum R = 340
gn = 100

\[ E = \frac{(50)^2}{340} \times \frac{100}{100} = \frac{2500}{340} \times \frac{100}{100} = \frac{2500}{340} = 7.35294 \]

Example number 2
N = 25
Sum R = 170
gn = 50

\[ E = \frac{(25)^2}{170} \times \frac{100}{50} = \frac{625}{170} \times \frac{100}{50} = 7.35294 \]

Example number 3
N = 10
Sum R = 68
gn = 20

\[ E = \frac{(10)^2}{68} \times \frac{100}{20} = \frac{(100)5}{68} = 7.35294 \]





The immediately preceding illustrations are calculated to 
show that the new formula is an adequate means of deter- 
mining the efficiencies of advertisements and that it is not 
subject to the criticisms which were mentioned as valid with 
regard to the other statements of the formula. It should be 
emphasized again that the new formula renders all results 
from all groups, regardless of size, in such form that they are 
truly comparable. This end is realized by referring all 
efficiencies to a standard group of subjects and tabulating 
them in terms of what they would have been if the standard 
group had been used, and provided that their average rank 
and percentage of recall had remained the same. 
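
A minimal Python sketch of the new statement of the formula is given below. The function and constant names are assumptions made for illustration; the inputs are the three examples worked above, and the sketch only shows that the factor 100/gn brings all three to the same index.

```python
STANDARD_GROUP = 100  # the paper's standard number of subjects


def standardized_index(n_recalls, sum_of_ranks, group_size):
    """New statement: (N^2 / Sum R) x (100 / gn)."""
    return (n_recalls ** 2 / sum_of_ranks) * (STANDARD_GROUP / group_size)


# (N, Sum R, gn) for the three examples worked in the text
examples = [(50, 340, 100), (25, 170, 50), (10, 68, 20)]
indices = [standardized_index(n, sum_r, gn) for n, sum_r, gn in examples]
print([round(e, 5) for e in indices])
```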

The formula was designed to give weight to the two factors 
of percent of recall and rank of recall. The aim is to give 
weight to both factors so that if the advertisement gains 
efficiency by reason of many recalls, or by reason of early 
recalls or by both, the formula will bring out the single or 
joint effect, and show it in a single numerical expression. 
Furthermore, it should be noted that the formula negates the 
false value which might be given to a large number of recalls 
coming late in the series, and also prevents a false impression 
from being given by a low average rank secured by a few 
early recalls. In this connection it is well to mention that in 
some cases the indications of efficiency by the percent of recall 
and by the average rank may vary widely. The formula 
reconciles these differences and gives the proper indication as 
determined by the joint effect of the two factors. 
It now remains to be shown that the new formula, as it has
been devised and stated above, is better than some other state- 
ment of the same formula; that it is better than some other 
statement which shows the same relationships, but which 
necessitates some other procedure than the one outlined above 
for solution. That it can be reduced to a different statement 
will be brought out by the following analysis: 





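\[ \frac{N^2}{\text{Sum } R} \times \frac{100}{gn} = \frac{N(N)}{N(\text{Ave. } R)} \times \frac{100}{\dfrac{N(100)}{\%\text{ of Recall}}} = E \]

because

\[ N^2 = N(N) \]
\[ \text{Sum } R = N(\text{Ave. } R) \]
\[ \text{Percent of Recall} = \frac{N(100)}{gn}. \quad \text{Therefore } gn = \frac{N(100)}{\%\text{ of Recall}} \]

Substituting in the formula we have

\[ \frac{N(N)}{N(\text{Ave. Rank})} \times \frac{100}{\dfrac{N(100)}{\%\text{ of Recall}}} = E \]

Resolving the formula we get

\[ \frac{\text{Percent of Recall}}{\text{Average Rank}} = E \]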


It should be noted that the formula is reducible to this last 
statement because of the fact that all efficiencies are reduced 
by the new formula to what they would have been by the old 
formula if a group of one hundred subjects had been used. In 
other words, the efficiencies of the advertisements as found
by either statement of the new formula are the same as they
would have been if determined by the \(N^2 / \text{Sum } R\) formula with
one hundred subjects.
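
The agreement of the two statements can be checked numerically. The Python sketch below is illustrative only: the function names are assumptions, and the inputs are the three examples worked earlier in the paper.

```python
import math


def standardized_index(n_recalls, sum_of_ranks, group_size):
    """First statement: (N^2 / Sum R) x (100 / gn)."""
    return (n_recalls ** 2 / sum_of_ranks) * (100 / group_size)


def ratio_index(n_recalls, sum_of_ranks, group_size):
    """Second statement: Percent of Recall / Average Rank."""
    percent_of_recall = 100 * n_recalls / group_size
    average_rank = sum_of_ranks / n_recalls
    return percent_of_recall / average_rank


# (N, Sum R, gn) for the three examples worked earlier in the paper
examples = [(50, 340, 100), (25, 170, 50), (10, 68, 20)]
agree = [
    math.isclose(standardized_index(*ex), ratio_index(*ex))
    for ex in examples
]
print(agree)
```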

It will be of interest to note the mathematical performances 
which must be executed to utilize the two statements of the 
new formula. The performances necessary to utilize the
statement

\[ \frac{N^2}{\text{Sum } R} \times \frac{100}{gn} = E \]

are as follows:

1. Find the number of times the advertisement has
   been recalled.
2. Square the number of recalls (the squares may
   be written down from prepared tables).
3. Find the sum of the ranks.
4. Establish the standard \(100/gn\) indicated.
5. Substitute and solve the formula.


The performances necessary to utilize the statement

\[ \frac{\text{Percent of Recall}}{\text{Average Rank}} = E \]

are as follows:

1. Find the number of times the advertisement has
   been recalled.
2. Divide the number of recalls by the number in
   the group to get the percent of recall.
3. Find the sum of the ranks.
4. Divide the sum of the ranks by the number of
   recalls to find the average rank.
5. Substitute and solve the formula.


On the whole it is easier and quicker to use the statement

\[ \frac{N^2}{\text{Sum } R} \times \frac{100}{gn}. \]

When this statement of the formula is used,
performance number four may be done once and the result
obtained may be employed in all problems in the group,
whereas by use of the other statement of the formula part
four would have to be done anew for each problem. When
using the above suggested statement of the formula,
performance number two can be done more accurately, quickly
and easily than part number two of the second method,
because of the possibility of using simple prepared tables.
The other parts of the operation are about equal in complexity
and difficulty. The writer would recommend the statement
of the formula given above in this paragraph, because it is
easier to arrive at the elements to be substituted, because the
operations may be more rapidly executed, and finally, because
it offers less opportunity for error.

The only advantage the writer can see in the simplified 
formula is that it brings out more clearly how very significant 
changes in either percent of recall or average rank really are. 
It will be noted that the efficiency of an advertisement may 
be increased by either increasing the percent of recalls or by 
decreasing the average rank since the formula operates as if
it were written

\[ \frac{\text{Percent of Recall}}{\text{Average Rank}} \]

In connection with the statements made in the last paragraph,
some discussion should be given at this point of a
special or unexpected phenomenon connected with the
application of the formula.

A scrutiny of the formula will reveal that the indices of
efficiency determined by it vary directly as the number of
recalls increases and inversely as the rank of recalls increases.

Thus suppose we have the following conditions from which
to work out the efficiency of an advertisement:


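Rank ...........................    7    8    9
Number of Recalls ..............    2    2    4
Total Number of Recalls ........    8
Sum of the Ranks ...............    66
Group Number ...................    100

Substituting in the formula:

\[ E = \frac{N^2}{\text{Sum } R} \times \frac{100}{gn} = \frac{64}{66} \times \frac{100}{100} = \frac{64}{66} = .97 \]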


Then doubling the recalls we have the following: 


Rank ...........................    7    8    9
Number of Recalls ..............    4    4    8
Total Number of Recalls ........    16
Sum of the Ranks ...............    132
Group Number ...................    100

Substituting in the formula:

\[ E = \frac{N^2}{\text{Sum } R} \times \frac{100}{gn} = \frac{256}{132} \times \frac{100}{100} = \frac{256}{132} = 1.94 \]

Doubling the number of recalls doubled the index of efficiency 
when the ranks remained the same. A like result is obtained 
when the ranks are divided in half and the number of recalls 
remains the same. The converse of these two statements can 
also be demonstrated in the same way. 
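
The two statements about doubling can be verified with a short Python sketch. The function name and the {rank: number of recalls} layout are assumptions made for illustration; the first table is the one used in the text, and the halved ranks are a purely arithmetical device rather than ranks that could be observed.

```python
def standardized_index(recalls_by_rank, group_size):
    """(N^2 / Sum R) x (100 / gn), from a {rank: number of recalls} table."""
    n_recalls = sum(recalls_by_rank.values())
    sum_of_ranks = sum(rank * count for rank, count in recalls_by_rank.items())
    return (n_recalls ** 2 / sum_of_ranks) * (100 / group_size)


first = {7: 2, 8: 2, 9: 4}  # N = 8, Sum R = 66
doubled = {rank: 2 * count for rank, count in first.items()}       # N = 16, Sum R = 132
halved_ranks = {rank / 2: count for rank, count in first.items()}  # N = 8, Sum R = 33

e_first = standardized_index(first, 100)
e_doubled = standardized_index(doubled, 100)
e_halved = standardized_index(halved_ranks, 100)
print(e_first, e_doubled, e_halved)
```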

The results given in the above paragraphs would tend to 
show that an advertisement would have an index of efficiency 
directly proportional to the number of times it is recalled; 
that is to say, we would conclude on a theoretical basis that 
chance factors would operate to secure such a distribution of 
the recalls that the resulting indices of efficiency of the adver- 
tisements would vary directly and in the same proportion as 
the number of recalls. Such a conclusion is not justified as 
is shown by the facts in the following paragraphs. 

A correlation technique was applied to discover the amount 
of agreement between the average ranks and the percentage 
of recall of the advertisements in one of the magazines used in 
the experiments to test the formula. The Pearson Product 
Moment method was employed, and a negative correlation of 
.614 was found between these two factors. The correlation 
of —.614 is regarded as indicating a marked tendency for the 
advertisements which have been recalled a large number of 
times to be of proportionately lower average rank than an 
advertisement which is recalled a smaller number of times. 
The advertisement with the larger percentage of recall would, 
therefore, tend to have an index of efficiency larger than its
percentage of recall would indicate when considered in rela- 
tion to another advertisement which has a smaller percentage 
of recall. Thus if there were an advertisement which were 
recalled forty times and another which were recalled ten 
times, the correlation shows that the one with ten recalls 
would not have an index of efficiency one fourth as great as 
that of the one with forty recalls, since it would not get as
much benefit from the tendency for the recalls to be con- 
centrated towards the lower ranks. In other words, the 
advertisement with the forty recalls would have a lower 
average rank than the advertisement with ten recalls. The 
significance of the lower average rank will be seen when it 
is remembered that the formula may be written as follows: 
$$\frac{\text{Percent of Recalls}}{\text{Average Rank}} = E$$
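
That this form agrees with the earlier statement of the formula may be
shown in one line. The derivation assumes, as the definitions imply,
that the percent of recalls is 100 N / gn and that the average rank is
Sum R / N.

```latex
\[
\frac{\text{Percent of Recalls}}{\text{Average Rank}}
  = \frac{100\,N / gn}{\text{Sum } R / N}
  = \frac{N^{2}}{\text{Sum } R} \times \frac{100}{gn}
  = E
\]
```

For a given percent of recalls, a lower average rank therefore yields a
larger index of efficiency.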





INFLUENCE OF THE SIZE OF THE ADVERTISING SECTION 


It is a known fact that the attention and memory value of 
advertisements vary in some as yet not definitely known 
degree as the size of the advertising section varies. In order 
to render the results secured in one work comparable to the 
results obtained by using some other medium with a smaller 
or larger number of advertisements in it, the indices of effi- 
ciency as given for the various advertisements would have to 
be translated into the terms of the other group with which 
they are compared, or the reverse, or both may be referred 
to a standard measure. It would be better to establish some 
standard form or number to which all could be referred, 
because then all results would be comparable in at least this 
one respect. The formula which we have proposed in this 
work might be written finally in such a way as to give results 
in terms of a standard group of subjects and in terms of a 
standard group of advertisements. Such a formula would 
have to take into consideration the effect of variation of size 
of the advertising section both upon the number of recalls 
and the order of recalls. 


The formula so far developed in this work,

$$\frac{N^2}{\text{Sum } R} \times \frac{100}{gn} = E,$$

gives results in terms of a standard group of subjects, but the
factor of difference in size of the advertising section is not
considered. That this formula can be modified to give results
in terms of a standard group of subjects and in terms of a
standard group of advertisements will be made clear by the
following discussions.
Instead of using the simple element given above as 100/gn,
which translates the results into terms of a standard group
of subjects, we may use a "Group Constant" which takes
into account the size of the advertising section as well. The
purpose of the next few paragraphs is to give the derivation
of the "Group Constant," which will henceforth be referred
to as the GK.

The development of the GK in this paper will have to 
depend upon information from other experimenters’ work, 
because the present experimenters have not as yet conducted 
any experiments on the particular topic involved. Professor 
E. K. Strong conducted an investigation to determine how an 
advertisement varies in attention and memory value with 
increase in size of the advertising section. The value of this 
work for our purpose is limited, because he studied only full 
pages. The results are reported by Starch in summary form.2
The results from a similar experiment by Thomas C. Blanchard
and Carl J. Warden3 agree in essential parts with the
results of Strong’s experiments. The per cent of recall was 
used in Strong’s experiments to indicate the efficiencies of the 
advertisements. His results show that the decrease in per- 
cent of recall value of each advertisement is 24.2 when the 
advertising section is increased from ten advertisements to 
fifty. That is to say, if the attention and memory value of 
each advertisement in the advertising section which comprised 
only ten advertisements were 81.4 percent, then the attention 
and memory value of each advertisement in a group of fifty 
would be 57.2 percent. This decrease in recall value gives 
an average decrease of three percent for each five advertise- 
ments added to the advertising section above ten and up to 


2 Starch, Daniel, Principles of Advertising, pp. 796ff.
3 JOURNAL OF APPLIED PSYCHOLOGY, Vol. 10, 1926, pp. 162ff.
fifty. The decrease in recall value resulting from increasing
the advertising section from fifty to one hundred fifty is 22 
percent. This decrease gives an average decrease of 2 per- 
cent for each ten advertisements added to the advertising 
section above fifty and up to one hundred fifty. 
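
The averages quoted above follow from simple division. A small sketch
of that arithmetic in Python, using only the figures reported from
Strong's results; the helper name average_decrease is illustrative.

```python
def average_decrease(total_decrease, ads_from, ads_to, step):
    """Average drop in percent of recall per `step` advertisements added."""
    steps = (ads_to - ads_from) / step
    return total_decrease / steps

# 24.2 points lost between 10 and 50 advertisements, in steps of five
per_five = average_decrease(24.2, 10, 50, 5)
# 22 points lost between 50 and 150 advertisements, in steps of ten
per_ten = average_decrease(22, 50, 150, 10)

print(round(per_five, 3), round(per_ten, 3))  # 3.025 2.2
```

The text rounds these to three percent for each five advertisements and
two percent for each ten.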

That these average decreases for the various groups added 
to the advertising section are not absolutely the correct ones 
for each group is recognized. The decrease in value of each 
advertisement upon the addition of the first ten advertise- 
ments to an advertising section of fifty may be different from 
the decrease for the next ten added. It does not appear prac- 
tical, however, to have an index for each advertisement added 
to the section, or for groups smaller than those suggested. If 
indices for smaller groups were used, numerous tedious frac- 
tions would likely enter the calculations and make no real 
difference in the end results. When the experimental work 
is more complete it will be well to work out tables of variation 
which will eliminate the necessity for computing the variation 
for each group. It should also be indicated again at this 
point that the results obtained by Strong are not adequate to 
use in determining the GK in the case of advertisements of 
different sizes, since he used only full page advertisements in 
his study. 

In order to show how the "Group Constant" in the formula,
which we propose as the final one, may be derived, we
shall use Strong’s results as a basis of study and for illus- 
tration. We do this, however, with full recognition of the 
inadequacy of Strong’s results as a basis for determining the 
correct GK. 


Let us take a group of 100 advertisements as our standard 
size advertising section. Then our purpose would be to find 
a method of converting the indices of all advertisements into 
what they would have been if they had been observed in an 
advertising section made up of 100 advertisements. This 
conversion is accomplished through the GK in our formula. 
To find the GK for a group of advertisements numbering
between 100 and 150, we would use the following relationship:

$$\frac{100}{gn} \times \left[ 1 + \left( \frac{na - 100}{10} \times .02 \right) \right] = GK$$

Where 


gn=Group number (number of subjects) 
na=Number of advertisements in the section 
GK =Group Constant. 


To find the GK for a group of advertisements numbering 
less than 100 and more than 50, we would use the following 
system of relationships— 

$$\frac{100}{gn} \times \left[ 1 - \left( \frac{100 - na}{10} \times .02 \right) \right] = GK$$


To find the GK for a group of advertisements numbering 
more than 10 and less than 50, the following system of rela- 
tionships would be used— 

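$$\frac{100}{gn} \times \left\{ 1 - \left[ \left( \frac{100 - 50}{10} \times .02 \right) + \left( \frac{50 - na}{5} \times .03 \right) \right] \right\} = GK$$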

If the advertising section contains 100 advertisements the 
GK will be the same as if determined by the first element in 
the equation since it would be just multiplied by one. If 
there are more than 100 advertisements in the section, the GK 
will be increased accordingly, and if there are less than 100 
advertisements in the section, the GK will be decreased ac- 
cordingly. 
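
The three relationships may be gathered into a single routine. The
following Python sketch assumes the reading of the relationships adopted
here: a change of .02 for each ten advertisements between 50 and 150,
and of .03 for each five advertisements between 10 and 50. The function
name group_constant is illustrative, and treating the limits 10, 50,
100 and 150 as included is an assumption of the sketch; at 50 and at
100 the adjoining relationships give the same value.

```python
def group_constant(gn, na):
    """Group Constant GK for gn subjects and na advertisements.

    Rests on Strong's full-page results as used in the text, so it is
    defined only for 10 <= na <= 150.
    """
    if not 10 <= na <= 150:
        raise ValueError("the figures used cover only 10 to 150 advertisements")
    if na >= 100:
        factor = 1 + (na - 100) / 10 * 0.02
    elif na >= 50:
        factor = 1 - (100 - na) / 10 * 0.02
    else:
        factor = 1 - ((100 - 50) / 10 * 0.02 + (50 - na) / 5 * 0.03)
    return 100 / gn * factor

for na in (10, 50, 100, 150):
    print(na, round(group_constant(100, na), 2))
```

With 100 subjects and 100 advertisements the GK is one, so the index is
left unchanged; larger sections raise it and smaller sections lower it.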

As stated above, the variations which are used to indicate 
how the GK may be found for different sized advertising sec- 
tions are based on the results obtained by Strong. The in- 
tentions of the writer are to determine by later researches the 
percentages of increase or decrease in value of advertisements 
of different sizes brought about by a change in the size of the 
advertising section. He plans to use the general method out- 
lined in this paper. 
The final statement of the formula which renders all results
in terms of a standard group of subjects and in terms of a 
standard group of advertisements, and which takes into ac- 
count the two factors of number of recalls and order of re- 
calls, would read as follows: 


$$\frac{N^2}{\text{Sum } R} \times GK = E$$


where 


N=Number of Recalls 

Sum R=Sum of the Ranks 

GK = Group Constant 

E = Efficiency. 
The GK would have to be determined for each group of adver- 
tisements and for each group of subjects in the manner indi- 
cated above. All the illustrations used above to show the 
application of the former statements of the formula could be 
used to illustrate the use of this final statement of the formula. 
The only difference would be in the end results; the indices of 
efficiency would be changed according to the variations in the 
size of the advertising sections. 
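
A brief sketch in Python shows how the end result moves with the GK.
The values N = 8 and Sum R = 66 are taken from the earlier illustration;
the three GK values are hypothetical and serve only to show the scaling.
The function name final_efficiency is likewise illustrative.

```python
def final_efficiency(n_recalls, sum_ranks, gk):
    """E = N^2 / Sum R * GK, the final statement of the formula."""
    return n_recalls ** 2 / sum_ranks * gk

# Hypothetical GK values below, at, and above the standard section
for gk in (0.9, 1.0, 1.1):
    print(gk, round(final_efficiency(8, 66, gk), 3))
```

The index is simply proportional to the GK, so the order of the
advertisements within any one section is not disturbed.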


WHAT THE FORMULA MEASURES 


In presenting an advertisement, the first aims of the adver- 
tiser are to attract the attention of the potential observers and 
cause them to get the message of the advertisement either by 
reading the printed materials or by interpreting the illustra- 
tions. Reactions to the advertisement in terms of business 
results will then depend upon the interests, needs and pur- 
chasing capacities of the observers. These interests and needs 
may be created by the contents of the advertisements or they 
may already exist. In the latter case the advertisement would 
serve only as a guide which turns the behavior of the individual
in a given direction. In the former case it may act
as a definitely conditioning factor to initiate new forms of 
behavior. The advertisement can be of no immediate business 
value unless the observer has the necessary purchasing power. 
The actual, concrete business efficiency of an advertisement
can be measured, according to our present knowledge, by 
counting the number of purchases which the advertisement 
actually induced the observers to make. This method, how- 
ever, is quite impracticable in most cases, because it involves 
the actual use of expensive advertising, the cooperation of 
many generally untrained workers in checking the purchases, 
and must be extended over long periods of time—the adver- 
tisements may result in purchases for many months. 

The method which is developed in this work does not 
measure the actual business efficiency of the advertisements. 
It indicates the power of the advertisement to attract the at- 
tention of the observer and to cause him to react to the pre- 
sented material in such a way that he is later able to recall 
and to identify the particular advertisement. The index of 
this power may be taken as a measure of the relative value of 
the various advertisements. Whether or not the observer will 
react to the advertisement by purchasing the goods advertised 
will depend upon factors other than those we propose to 
measure by the methods described in this work. The factors 
of the needs, interests and purchasing power of the observers, 
and the new behavior tendencies induced by the advertise- 
ments, must be measured by some other method. 

It should be noted at this point that there are other factors 
than the actual advertising material as it is presented in any 
given advertisement which give the advertisement recall 
value. The training effects of other advertisements of the 
same or similar types, presence of other advertisements of the 
same kind in the medium, and the interests of the observers 
should be listed among these factors. The results obtained 
by use of the formula developed in this work should be con- 
sidered in the light of the discussion given in this paragraph. 


CONCLUDING REMARK 


The value and adequateness of the formula and the method 
discussed in this paper have been demonstrated and tested by 
making a study with a current magazine and with a large
number of subjects. It is hoped that the full procedure and 
the results of this study and demonstration of the new method 
will be published in a later article. 


REPEATERS AT THE COLLEGE LEVEL


J. A. CEDERSTROM 
University of Minnesota 


With the development of educational scales for measuring 
range of information in zoology it is now feasible to compare 
the gains made by different groups of students working under 
the same conditions with a reasonable degree of accuracy. In 
many college classes a small group of repeaters periodically 
appears. Some have taken only a part of the course at some 
previous time; others have taken all of it without having 
made the desired grade. Among the several classes taking 
the introductory course in zoology at the University of Min- 
nesota during 1926-27, there were a sufficient number of 
repeaters to warrant a comparison of their gains with those 
made by the students beginning the course the first time who 
will be designated regular students to differentiate them from 
the repeaters.1

The measurements of initial attainment were made early 
in the course with two ranges of information scales in zoology 
that were developed by the author in collaboration with Dr. 
M. J. Van Wagenen of the Department of Educational Psy- 
chology. On these scales the scores which are independent 
of the particular test used are expressed in terms of a unit of 
measurement. This feature not only makes it feasible to 
measure gains in an intelligent manner but also to compare 
them and handle them statistically. 

Among the five courses the one beginning at the end of the 
first quarter, designated as Academic 1 W, and continuing 

1 The measurements from which the results in this study were derived
were made with the cooperation and assistance of the several members
of the Department of Zoology at the University of Minnesota.
throughout the rest of the year with four hours of lecture and
six hours of laboratory work, contains the larger number of 
repeaters, including many of those who failed in the four 
classes starting at the beginning of the year. Of the original 
enrollment of 174 in the Academic 1 W, twenty-eight were 
repeaters. Six of these, however, registered late, missing the 
initial test. They also dropped out before the second mea- 
surements were made; hence, for these, there are no measures. 
For seven more, who dropped out at the end of the first 
quarter, only the initial and second measurements are avail- 
able. For the remaining fifteen, the records are complete with 
one exception. 

Average scores in the initial tests for the repeaters who 
completed the course were higher than the average scores for 
the rest of the class. The repeaters’ average score was 81.36 
on Scale A against 75.66 for the regular students and 80.86
on Scale B against 76.25 for the regular students. The repeaters
did not maintain this lead over the regular students
in later tests, however; for the entire course the repeaters had
an average gain on Scale A of 15.07 scale points against 20.73
for the regular students and an average gain on Scale B of
16.78 scale points against 21.14.

Mention was made above that seven repeaters did not com- 
plete the work, dropping out at the middle of the course. 
Had the scores of these seven repeaters been included when 
making the average scores of the repeaters referred to in the 
previous paragraph, the average would have been reduced by
three scale points on Scale A and by five scale points on Scale
B. Calculating the average gains for these seven repeaters
not completing the entire course, we find said average gains
to be 5.43 against 7.21 on Scale A and 5.71 against 9.78 on
Scale B. Apparently these seven students were much weaker
students than the other repeaters with whom we have been
comparing them.
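
These first-half gains can be recomputed from the mean scores in
Table I. The short Python sketch below does so; the second Scale B mean
for the first group, printed as "85." in the table, is read as 85.00,
which the stated gain of 5.71 implies. The names used are illustrative.

```python
# Mean scores from Table I as (initial, second measurement)
first_half_only = {"A": (80.14, 85.57), "B": (79.29, 85.00)}
entire_course = {"A": (81.36, 88.57), "B": (80.86, 90.64)}

def gains(scores):
    """Gain on each scale from the initial to the second measurement."""
    return {scale: round(second - initial, 2)
            for scale, (initial, second) in scores.items()}

print(gains(first_half_only))  # {'A': 5.43, 'B': 5.71}
print(gains(entire_course))    # {'A': 7.21, 'B': 9.78}
```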
TABLE I
Mean Scores for the Three Groups of Students in Zoology on Information
Scales A and B

                                  INITIAL            SECOND             FINAL
                                MEASUREMENT        MEASUREMENT        MEASUREMENT
GROUP                    NO.  Scale A  Scale B   Scale A  Scale B   Scale A  Scale B

Repeaters completing
  only first half of
  course                   7    80.14    79.29     85.57    85.
Repeaters completing
  entire course           14    81.36    80.86     88.57    90.64     93.43    97.64
Regular students
  completing entire
  course                   ?    75.66    76.53     87.53    87.49     96.39    97.39





Attention has been called to the fact that initial scores of 
repeaters are higher than for those who are taking the course 
for the first time. The final scores of repeaters are approxi- 
mately the same as for the regular students; hence the mean 
gains of the repeaters fall below the mean gains made by 
regular students during the course. 

Academic standards would hardly be considered too rigid 
if the repeaters were required to make as great a gain as the 
mean gain made by those who are taking the subject for the 
first time. Since, with the added advantage of taking the 
course a second time, the best repeaters barely score on a par 
with the regular student, one is led to question the wisdom of
allowing students who have failed in a previous class to take 
the course a second time for credit. 

For each of the four classes starting at the beginning of 
the year the number of repeaters was considerably smaller;
but in all classes but one, that of the Predental students, the
same tendency for the repeaters to start with higher initial 
scores and finish with no higher or even lower final scores, if 
they finish the course at all, is apparent from the data in the 
tables below. 
TABLE II
Mean Scores for the Group of Students in Zoology on Information Scales
A and B for the Class Known as Section I

                                  INITIAL            SECOND             FINAL
                                MEASUREMENT        MEASUREMENT        MEASUREMENT
GROUP                    NO.  Scale A  Scale B   Scale A  Scale B   Scale A  Scale B

Repeaters completing
  entire course            5    85.8     83.6      90.2     90.6     101.4     99.
Regular students
  completing entire
  course                 100    85.05    86.63     89.94    90.89    102.61   102.88


























With an enrollment of 145, Section II boasts of only two
repeaters. Both dropped the course after completing the
first quarter’s work with inferior records as compared with
the records of the regular students in the class.


TABLE III
Mean Scores for a Group of Students in Zoology on Information Scales
A and B for the Class Known as the Premedics

                                  INITIAL            SECOND             FINAL
                                MEASUREMENT        MEASUREMENT        MEASUREMENT
GROUP                    NO.  Scale A  Scale B   Scale A  Scale B   Scale A  Scale B

Repeaters completing
  entire course            5    91.4     87.4      98.8     96.       98.8    101.6
Regular students
  completing entire
  course                  76    82.99    83.59     91.43    90.?   [1]01.21   102.7























In the case of two of the predental students, their credit in 
zoology was essential for admission to the School of Dentistry. 
This requirement may have been an added incentive for a 
more rigid application to the work of the repeaters considered 
in Table IV. 

TABLE IV
Mean Scores for a Group of Students in Zoology on Information Scales
A and B for the Class Known as the Predents

                                  INITIAL            SECOND             FINAL
                                MEASUREMENT        MEASUREMENT        MEASUREMENT
GROUP                    NO.  Scale A  Scale B   Scale A  Scale B   Scale A  Scale B

Repeaters completing
  entire course            3    86.      81.33     90.      87.      104.67    97.
Regular students
  completing entire
  course                  71    82.77    82.69     88.87    89.49     97.75    97.39


On the basis of native ability as determined by scores on the
College Aptitude Tests given entering freshmen at the University
of Minnesota, the repeaters appear to be distributed in each
quartile of such a percentile distribution. Listing repeaters
who completed a definite unit of the course in zoology during
1926-27, we have Table V indicating the percentile ratings
where these were available.


TABLE V
Quartile Distribution of Repeaters on Basis of Percentile Ratings on
Scores in the College Aptitude Tests given at the
University of Minnesota

                     THOSE          THOSE          THOSE WHO
                   COMPLETING     COMPLETING      DROPPED THE
QUARTILE             ENTIRE       FIRST HALF     COURSE WITHOUT    TOTAL
                     COURSE       OF COURSE       ANY MEASURES

Highest                 3              0               3              6
Second                  5              0               0              5
Third                   4              3               1              8
Lowest                  9              5               2             16
Percentile not
  given                 7              5               0             12
Totals                 28             13               6             47











It would seem as if there are elements, other than native 
ability as measured by the College Aptitude Tests, that play 
an important part in determining gains or achievement in
any course. 

While it is unfortunate that percentile ratings are not
available for more than three-fourths of our group, it is
apparent that repeaters are not restricted to any single
quartile. Of those for whom percentile ratings are available,
two-thirds were in the two lower quartiles and the remaining
one-third were about equally distributed in the two upper
quartiles of this distribution. If mean gains made by repeaters
completing the course are calculated for each quartile, it
appears the differences in such means are not as marked as one
might expect. The highest mean gain on Scale A was made
by the repeaters in the lowest quartile, while the lowest mean
gain on the same scale was made by those in the highest
quartile. On Scale B the highest mean gain was made by those in
the second quartile and the lowest by those in the third
quartile. The three outstanding gains were made by repeaters
whose percentile ratings placed each one in the lowest quartile.
Next to the lowest mean gain on the two Scales A and B was
made by the repeater having next to the highest percentile
rating of the twenty-eight repeaters completing the entire
course. We might also add regarding this particular case
that the final score for this repeater on both Scales A and B
places him near the bottom of the upper quartile distribution
of scores at the end of the course. The lowest mean gain,
however, is made by a repeater whose percentile rating places
him in the lowest quartile. Thus in this group the two lowest
gains of repeaters are made by one in the lowest quartile and
the other in the highest quartile of a distribution based on
percentile ratings in the college aptitude test. It is also
apparent from this study that individual gains made by
repeaters do not bear a direct relation to percentile ratings
on these college aptitude tests.
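
The proportions cited in this paragraph may be checked against the
totals of Table V. The Python sketch below uses only the table's TOTAL
column; the variable names are illustrative.

```python
# Totals from Table V for repeaters whose percentile ratings were available
rated = {"Highest": 6, "Second": 5, "Third": 8, "Lowest": 16}
not_given = 12  # repeaters for whom no percentile rating was given

total_rated = sum(rated.values())
share_rated = total_rated / (total_rated + not_given)
share_lower = (rated["Third"] + rated["Lowest"]) / total_rated
share_upper = (rated["Highest"] + rated["Second"]) / total_rated

print(f"rated: {share_rated:.0%}, lower two quartiles: {share_lower:.0%}, "
      f"upper two quartiles: {share_upper:.0%}")
```

Ratings are available for 35 of the 47 repeaters, or about
three-fourths, and 24 of those 35, roughly two-thirds, fall in the two
lower quartiles.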
SUMMARY


1. The measuring instruments used in this study were
Achievement Scales in College Zoology, Scale A and Scale B.

2. Gains cited in this study are calculated from initial
scores in the scales given early in the course and from scores 
in the scales given at regular intervals. 

3. In general, the mean scores of the repeaters in the initial 
measurements are higher than the mean scores of students 
taking the course for the first time. The repeaters do not 
maintain this lead during the progress of the course. The 
lead is gradually reduced as the course progresses. 

4. There is little difference between the mean scores of 
repeaters and regular students at the end of the course; the 
repeaters know as much zoology on completing the course as 
do the regular students. Achievement with repeaters seems 
to resolve itself into a question of time consumed in mastering 
the subject rather than greater proficiency in the subject. 

5. The mean gain of repeaters is less than it is for the stu- 
dents who take the course for the first time. 

6. It would seem that repeaters are a definite "drag" in
any class, that they entail an added expense upon the institu- 
tion with results not commensurate with the expenditure. 
The quality of the product is not in keeping with the time 
devoted to its production. 

7. Repeaters are not restricted to any quartile of a distribution
of the cases based upon scores in a college aptitude test;
they do occur more frequently in the two lower quartiles. 

8. Gains made by repeaters do not seem to bear a direct 
relation to their percentile ratings on the college aptitude tests 
given to entering freshmen at the University of Minnesota. 
Some of the greatest gains were made by repeaters listed in 
the lowest quartile of the distribution based on scores in the 
college aptitude tests. 

9. It would seem that elements other than native ability as 
determined by scores in college aptitude tests make for 
achievement in zoology and the author of this study offers this 
axiomatic summation—ACHIEVEMENT NECESSITATES 
A RIGID APPLICATION AND DEFINITE MOTIVATION 
TO ACCOMPLISH A DESIRED ACADEMIC RESULT. 





NOTES ON THE MEIER-SEASHORE ART 
JUDGMENT TEST 


PAUL R. FARNSWORTH AND ISSEI MISUMI 
Stanford University 


Introduction. This paper reports a series of minor studies of the 
Meier-Seashore Art Judgment Test,1—a booklet in which occur 125 series 
of two pictures each. Each series pair is composed of pictures which are 
identical except for some detail. This slight difference, however, is sup- 
posed to make one of the pair the better. The authors of the present 
study drew their subjects from large classes in elementary psychology in 
which each student was required to serve as a subject in two laboratory 
experiments. Since such service was compulsory, and since the members 
of Stanford psychology classes do not in the main form a specialized 
group, there were probably few important selective elements operating.
It is the guess of the authors that the subjects were, for college students,
below average in their acquaintance with the arts, for Stanford offers no
work in music and has a relatively small department of fine arts. Of the 
212 subjects, 7 per cent had had one course or more of college art, 1 
per cent had had private instruction in art, and 1 per cent had attended 
art school. 

Relation to Norms. The mean score for the group was 95.5 with a 
sigma of 8.0. The median was also 95.5. This value equals 76.4 per 
cent correct. Such a figure puts the Stanford group between the ‘High 
Average’ and ‘Superior’ classes on the test’s tentative norms. However, 
on new senior high school norms not yet issued by Meier, these median 
and mean scores fall on the 51st percentile. 

Item Agreement. In this portion of the study the test blanks were 
divided into two groups of equal size. The responses to specific pictures 
were tabulated (votes for left or right), and the values for the two 
groups were intercorrelated. The resultant r was .96, indicating that 
the ballot magnitudes for the two groups were quite similar. The ballot 
magnitudes for the two groups were averaged, and are given below in 
terms of per cent. For example, in the case of picture pair, Number 1, 
81 per cent of the ballots were cast for L (left). The starred numbers 
are cases in which the majority of ballots were cast contrary to the Meier


1 Bureau of Educational Research and Service, University of Iowa,
1929.
stencil. The per cents given are for the left-hand members of the various
picture pairs. 


PER CENT OF BALLOTS CAST FOR LEFT PICTURES

 No.   %     No.   %     No.   %     No.   %     No.   %     No.   %     No.   %
  1.  81     21.  82     41.  15     61.  24     81.  85    101.  25    121.   ?
  2.   ?     22.   ?     42.   9     62.  86     82.  63    102.  35    122.   ?
  3.   4     23.  86     43.  85    *63.  49     83.  78    103.  14    123.   ?
  4.  32     24.   5     44.  65     64.  30    *84.  46    104.  34    124.   ?
  5.  35     25.  23     45.  25    *65.  49     85.  71    105.  53    125.   ?
  6.   8     26.  87     46.  75     66.  62     86.  20    106.  91
  7.   ?     27.   ?     47.   ?     67.   ?     87.   ?    107.   ?
  8.  97     28.  76     48.  49     68.  29    *88.  58   *108.  59
  9.  85     29.  13     49.  20     69.  66     89.  87    109.  30
 10.   6     30.  94     50.  65     70.  65     90.  33    110.  27
 11.   ?     31.   ?     51.   ?     71.   ?     91.   ?    111.   ?
 12.   8     32.  16     52.  54     72.  72     92.  49    112.  10
 13.  23     33.  90     53.  18     73.  25    *93.  58    113.  78
 14.   5     34.   6     54.  76     74.  68    *94.  48    114.  42
 15.  82     35.   4     55.  72     75.  43     95.  58    115.  73
 16.  13     36.  23     56.   5     76.  61     96.  71    116.  83
 17.   ?     37.   ?     57.   ?     77.   ?     97.   ?    117.   ?
 18.   5     38.  68     58.   5     78.  56     98.  23    118.  22
 19.   7     39.  82     59.  95     79.  29     99.  91    119.  17
 20.  90     40.  97     60.  95     80.  28    100.  82    120.  39

Relation to ‘Intelligence.’ Meier has reported two correlations between
his art test and ‘intelligence.’ High school students gave a value
of -.146 ± .09 (Terman Group); that of college undergraduates was
-.018 ± .09 (Thorndike and others).2 The present writers found a
similar coefficient: +.079 ± .06 (Thorndike).

Reliability. In the same study Meier has given data on reliability.2, 3
The correlation between the odd and even items was .65 when raised by 
Spearman-Brown. A similar procedure with the Stanford data yielded a 
value of .59. Such reliability values are so low as to limit the test in 
its present form to group work. However, future changes in the test may 
perhaps raise its reliability. 
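
The Spearman-Brown step mentioned above can be sketched in a few lines
of Python. The formula is the standard one for lengthening a test; the
function names are illustrative. The uncorrected odd-even correlations
printed by the loop are not reported in either study and are merely what
the stepped-up values of .65 and .59 would imply.

```python
def spearman_brown(r_part, factor=2):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_part / (1 + (factor - 1) * r_part)

def half_test_r(r_full):
    """Invert the two-fold correction to recover the odd-even correlation."""
    return r_full / (2 - r_full)

for label, r in (("Meier", 0.65), ("Stanford", 0.59)):
    raw = half_test_r(r)
    print(label, round(raw, 2), round(spearman_brown(raw), 2))
```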

Use of a Scale of Values. The present writers made an attempt to 
compare the judgments obtained by Meier’s procedure with those found 
when a scale of values method was employed. Fifty-six subjects were 
asked to rate the right-hand members of Meier’s 125 pairs of pictures. 


2 Meier, N. C., Aesthetic judgment as a measure of art talent. Un. of 
Iowa Studies, 1, 19, 114, 1926. 

3 Meier, N. C., A measure of art talent. Psych. Monog., 39, 2, 184,
1928. 
Seventy-eight subjects were asked to rate the left-hand members. By
this method, the two fairly large groups of similar university subjects 
together rated the 250 pictures of Meier’s battery. 

The directions were as follows: "You are to rate preferences (likes
or dislikes) for 125 prints. Give a rating of 1 if the print strikes you
as very beautiful, a rating of 2 if pleasing, 3 if indifferent, 4 if
displeasing, and 5 if very unattractive."

It is to be recalled that Meier has designated one member of each pic- 
ture pair as the better one. The means of these ‘better’ pictures ranged 
from 4.26 to 1.78 with a grand mean of 2.71 (sigma .48). The means 
of the ‘worse’ pictures ranged from 4.10 to 1.54 with a grand mean of 
2.73 (sigma .52). The critical difference is, of course, not statistically 
reliable. Sixty-eight of the ‘better’ picture means were of greater 
magnitude than the means of their pair mates; 7 had the same value; 
50 were smaller. 

Summary. Two hundred twelve Stanford subjects who had had little 
formal work in art were given the Meier-Seashore Art Judgment Test. 
The mean and median scores were in accord with norms recently de- 
veloped by Meier at Iowa. There was practically zero relationship with 
Thorndike ‘intelligence.’ The odd-even reliability was low—too low to 
allow the present form of the test to be very useful as a psychological 
tool for anything but group work (r, .59). Two large groups of stu- 
dents agreed quite well (r, .96) as to their votes for the various pair 
members. In the main they agreed with Meier’s stencil. By a scale of 
values method it was found that the ‘better’ members of the picture 
pairs were slightly preferred to the ‘worse’ members. The differences 
were not statistically reliable, however. 














NOTES AND NEWS 


THE JOURNAL OF APPLIED PSYCHOLOGY, Vol. IV, 1920, published
"Tables to Facilitate the Computation of Coefficients of Correlation by
the Rank Difference Method" by the Scott Company Laboratory.
Copies of these Tables in handy pamphlet form may now be purchased
from THE JOURNAL OF APPLIED PSYCHOLOGY, Ohio University, Athens,
Ohio, at the following prices: Single copies, 20 cents; ten or more 
copies at 15 cents each; in case of very large orders a somewhat lower 
price will be quoted. 


The first sessions of an Institute of Higher Education were held from 
August 17 to 22, at Cold Springs Resort, Hamilton, Indiana. Informal 
discussions were held during the forenoons, the remainder of each 
day being devoted to swimming, fishing and other forms of recreation. 
It was the unanimous decision of all the participants that the Committee 
on College Entrance Tests of the Ohio College Association could well 
afford to sponsor a two-week session for next year as well as commend 
a like undertaking to other organizations. Most active in this very 
promising movement are: H. A. Toops, Ohio State; Prof. Edgar Yeager, 
Indiana University, and James P. Porter, Ohio University. 


United States Commissioner of Education, William John Cooper, 
recently announced the appointment by the Secretary of the Interior 
of seventeen finance specialists from various sections of the United 
States to act as consultants in the Federal Office of Education’s four- 
year National Survey of School Finance. This is the third national 
educational study now being directed by the Federal Office of Educa- 
tion, and the finance survey administrative organization will be similar 
to that of the other two studies, the Survey of Secondary Education 
and the Survey of Education of Teachers. A sum of $350,000 has been 
authorized for use in this survey. 


The Carnegie Foundation for the Advancement of Teaching has 
recently published the Twenty-fifth Annual Report of the President of 
this Foundation. Dr. Henry Smith Pritchett, President of the Foundation 
since its beginnings in 1906 until August 1, 1930, was succeeded by Dr. 
Henry Suzzallo, former President of the University of Washington. 

The Report reviews the financial history of the Foundation and lists 
in detail the studies and publications with which the Foundation has 
concerned itself. Copies of the Report and of any of the fifty other
publications of the Foundation may be had without charge on appli- 
cation to the office of the Foundation at 522 Fifth Avenue, New York 
City. 


The Psychological Corporation in October, 1930, launched a five-year 
study, nation-wide in scope, to provide essential data for the improve-
ment of instruction in English usage, reading and vocabulary. The
work is directed by Dr. L. J. O’Rourke and is undertaken in cooperation 
with the English Council. The program has been conducted in forty- 
eight states, Hawaii, Porto Rico, and the Philippine Islands with ap- 
proximately 825,000 pupils participating. National norms may be ob- 
tained by those schools taking part in the English usage study last year. 
For samples of the tests and further information in regard to partici- 
pating in the October-November Achievement Test Program address 
The Director of English Program, The Psychological Corporation, 3506 
Patterson Street, N.W., Washington, D. C. 


The Office of Education’s Survey of Land Grant Colleges (Bulletin 
1930, No. 9) which was originally published in two volumes ($1.50 each) 
is now available in 21 parts from the Superintendent of Documents, 
Government Printing Office, Washington, D. C. While the survey report 
refers only to the 69 land-grant colleges, the information is invaluable 
to the faculties and students of all colleges and universities in the 
United States. The following are a few of the titles of the various 
sections: Historical Introduction, Control and Administrative Organiza- 
tion, Work of the Registrar, Teacher Training, Arts and Sciences, 
Research, etc. Prices range from 10 cents to 30 cents for each part.


Under the auspices of the Child Study Association of America a two- 
day conference will be held in New York City on October 19 and 20, 
1931. Among the topics to be discussed are the following: Social
and Economic Changes: How Is the College Meeting Them, and Their 
Effect upon Man and Woman in the Marriage Relation; Research in 
Family Life; The Status of Parent Education in the United States in 
Relation to State Programs. Seventy organizations including social, 
welfare and those carrying on programs in parent education are cooper- 
ating with the Child Study Association, forty-four of which will have 
exhibits at the Hotel Pennsylvania. For complete programs those inter- 
ested should address the Child Study Association of America, 221 West 
57th Street, New York City. 


The National Safety Council has organized a Child Education Sec- 
tion, the first meeting of which will be held in connection with the 
Twentieth Annual Safety Congress in Chicago, October 12 to 16, 1931,
at Hotel Stevens. Dr. Randall J. Condon will serve as chairman of 
this new Section and Dr. Thomas W. Gosling, of Akron, as vice-chairman. 
The programs contain such timely topics as the following: The Problem 
of Safety Brought Up To Date; How Can We Apply the Safety Recom- 
mendations of the White House Conference on Child Health and Pro- 
tection to the Schools; Successful Experiments in Teaching Safety to 
High School Students; The Challenge of Safety Education; The Place 
of Safety in the Progressive School. Further information concerning 
the program may be had from the Education Division of the National 
Safety Council, 1 Park Avenue, New York City. 


Out of 40,000 students competing in The Seventh Annual Scholastic 
Awards, the national competition for high-school students conducted by 
Scholastic magazine in Pittsburgh to stimulate creative work in art and 
literature, about 260 received prizes totalling more than $4,500. Over 
3,000 high schools from every state and from all island and territorial 
possessions had representatives in the contest. The Scholastic Awards, 
now in the seventh year, is the oldest of national high-school competi- 
tions and the only national competition for all branches of art work. 











BOOK REVIEWS 


DONALD A. LAIRD AND CHARLES G. MULLER. Sleep: Why We Need It
and How to Get It. New York, John Day, 1930. Pp. xi, 214. 13 
charts, 2 tables. 


This is an interesting and amusing book, popularly and carelessly writ-
ten in semi-journalistic style, and resplendent with anecdotes. It is
apparently of little scientific value since experiments and procedures are 
not described so that they could be repeated, and exact data are omitted, 
giving space to conclusions. It has an appendix giving a brief ‘de- 
scription of typical experiments,’ and a bibliography of 81 titles which 
are not referred to. There is no index. 

When the author says (p. 97) that the ‘‘ ‘curve of sleep’ does not 
run parallel to or check with curves for muscular relaxation during 
sleep,’’ the reviewer is impressed by the magnitude of the scientific con- 
tribution which would be made by an adequate description of a valid 
method of measuring the latter, which the authors describe in ohms. 
The fact that no such method is available to science for making these 
measurements does not deter the authors from claiming to have made 
them, neither does it prompt the authors to present such a method nor to 
explain how their ohms can possibly be a true measure of muscular 
relaxation. 

The following is an example of carelessness exhibited in the book 
(p. 100): ‘‘The longer we sleep the more relaxed we become, this re-
laxation being greatest during the first hour, when it is equal to that of
the next two and a half hours of sleep.’’ And again (p. 93): ‘‘We 
are less inclined to dream between the third and fourth hour of sleep.’’ 
(and on the next page) in ‘‘carefully planned and executed studies . 
there seemed to be no one hour when dreams were more likely to occur 
than any other hour.’’ The experiments on loss of sleep are said (p. 19) 
to have been sponsored by the American Association for the Advance- 
ment of Service. 

The following is given to illustrate both style and technique (p. 104ff):
‘‘Two boys, sleeping soundly and apparently happily, would keep six
observers busy making accurate computations. Sandwiches and coffee, 
coffee and more coffee were supplied these all-night crews, and not a cup 
was broken—but two spoons are still missing. 

‘‘Of course the subjects did not sleep very well the first night the gas
masks were worn. Nor the second. Nor the third. But after a year of 
wearing masks the boys became so attached to these bed companions that 
when summer vacation came they could hardly sleep without them. 
‘‘Out of all this work and all the discomfort came a small curve which
could conveniently be drawn on the back of a social sized envelope. But 
it was a tremendously significant curve, again showing how close in sleep 
we are to death. .. .’’ The chapter concludes: ‘‘and most of us do not 
die in our sleep.’’ 

An example of precision (p. 121): ‘‘It would be most helpful if we 
could get a norm of sleep—the ideal amount in hours, minutes and sec- 
onds.’’ And in a sort of expectancy table (p. xi) we find for instance 
that ‘‘if you are now 20, you will sleep 16 years, 8 months, 3 days and 8 
hours. If you are 25, you will sleep 15 years, 0 months, 0 days and 0 
hours.’’

‘‘For those who want money, sleep will help get it.’’ (p. 7). The
reviewer predicts that this law will be empirically confirmed by handsome 
proceeds from the present book. 

The whole volume is a curious combination of anecdotes, arm-chair ad- 
vice, and conclusions from experiments, the results of which are not cited, 
for which credit is not given, and some of which have been disputed by 
other results, apparently unknown to the present authors. The most 
generally applicable essence of the book could be summarized under three 
points. (1) Sleep is necessary. (2) Loss of sleep is sometimes unfortu- 
nate. (3) Everyone sleeps, at times, including famous men and mil- 
lionaires. 

C. R. GARVEY,
Institute of Human Relations, 
Yale University. 


ALES HRDLICKA. Children Who Run on All Fours. New York: Whittle-
sey House (McGraw-Hill). 1931. 418 pp. $5.00.


The chance observation of an Indian child running on all fours (hands 
and feet) aroused Dr. Hrdlicka’s interest in this phenomenon. Nearly 
thirty years later, having searched out the scattered literature, his inter- 
est was given wide publicity through Science Service in popular maga- 
zines and newspapers. From this publicity he received letters from 
parents and other relatives reporting such behavior in 369 white children 
and eighteen non-whites. Nearly three-quarters of the book is devoted 
to the publication of these letters and some photographs. 

In the analysis which occupies the first one hundred pages the con- 
tents of these letters are summarized. General health, strength, sex, 
heredity and mentality of infants using quadrimembral locomotion are 
noted. Variations of the method, sleeping on all fours, and behavioral 
accompaniments are also mentioned. Accidental causes and imitation are 
dismissed as improbable causes, while heredity, especially phylogenetic, 
is emphasized. 

To evaluate this book is somewhat difficult. As a contribution to a 
specific part of infant behavior it has the unique value of any pioneering 
work. However, its scientific value is lessened in that no attempt has
been made to contrast the concomitants mentioned above with similar 
behavior in a control group of non-‘‘all four’’ walkers. Neither has 
there been any systematic endeavor to discover the prevalence of this 
mode of locomotion. From the apparently wide publicity given to the 
request for information and the relatively few cases reported one might 
conclude that the phenomenon is extremely rare. However, no evidence is 
presented to demonstrate this conclusion or any other. 

The primary argument of the author seems to be that his evidence indi- 
cates, or at least presents further facts in regard to, the phylogenetic his- 
tory of man. However, the majority of instances cited in support of the 
relation between this form of behavior and that of animals refer to 
similarities to bears, cats and dogs, none of which appear in the pre- 
human line. Other comparisons are concerned with simian and anthropoid 
behavior, e.g., climbing, prehensility of the toes, position of the hand 
while running. Such comparisons, Wood-Jones has pointed out (Man’s 
place among the mammals, p. 356), probably do not indicate the phylo- 
genesis of man. At best they might be used to demonstrate that man 
has had animal ancestors, but such a thesis hardly needs support to-day. 

C. M. LOUTTIT,
Ohio University. 


E. B. HOLT. Animal Drive and The Learning Process. Henry Holt
and Co. 1931.

Using as a sub-title the Jamesian concept Radical Empiricism in Vol. 
1 of a two volume series, the second of which is yet unpublished, the 
author trenchantly attacks the absurdities of ‘‘subjectivism’’ and any 
form of ‘‘psychological-parallelism.’’ And in doing so he states that
there are two ways of considering that which ‘‘moves an animal to ac- 
tion.’’ One is that ‘‘chemical energy derived from food is the source of 
all activity ;’’ the other is ‘‘the source of action is feeling, emotion, de- 
sire, or something of that kind.’’ The author of course accepts the 
former way of thinking and puts all action on a physiological basis. He 
goes as far as putting all ‘‘conscious phenomena’’ (not consciousness as 
a mental substance per se) on a physiological basis and herein lies the 
radical aspect of his empirical way of thinking. In order to support 
this way of thinking the author, unlike many writers who marshall ‘‘ver- 
bal magic’’ to support points of view maintained, delves into the most 
critical and crucial experimental physiological literature carrying the 
reader through fact after fact in order to enable him to see the ‘‘hows’’ 
and ‘‘whys’’ of nerve action until finally even the most elaborate reactions 
of man are explained, partially at least, in a physical way without 
positing ‘‘instincts,’’ ‘‘purposes,’’ ‘‘goals’’ (‘‘verbal magic’’) as the 
basis of action in human organisms. The author claims that even the 
Gestalt protagonists are tending to treat each ‘‘configuration’’ as an
entelechy which is a form of ‘‘verbal magic;’’ and while some forms of
behaviorism avoid ‘‘faculty-entelechy’’ they do so by ruling out con-
scious phenomena and in so doing flatly deny psychological problems.
With the introductory chapter on ‘‘Physiology Versus Verbal Magic’’
out of the way the author then plunges into growth and learning of the
developing organism. Growth and Learning (beginning with the em-
bryo) for Holt are one continuous process. In support of this statement
the author marshalls evidence from Coghill and Child’s studies of the
‘‘physiological gradients’’ and by making use of Neurobiotaxis (den-
drites grow towards an active neurone, etc.) he supports his contention
of growth and function developing simultaneously. This means, of 
course, that learning begins before birth. In fact it is this dendritic 
growth of neuronic connections which makes it possible for certain 
pathways to become ‘‘canalized’’ and to act before birth; this accounts 
for the so-called instincts or behavior patterns which appear at birth 
or soon thereafter. 

By taking ‘‘neurobiotaxis’’ as the underlying histological aspect of all
‘‘reflex conditioning’’ and the ‘‘growth of dendrites, under the stimulus
of nerve impulse as the basis of learning’’ the author is then ready to 
explain the earliest ‘‘random movements’’ of the organism. Picture the
early nervous system of higher mammalians before afferent, connecting
and efferent neurones get ‘‘canalized’’; then one sees the nervous sys-
tem of man somewhat like the primitive nerve net system. In this stage
of development any afferent impulses may diffuse or spread so as to ac-
quire functional connection with any muscle. So true is this
that there are no definite sensory and motor paths or ‘‘reflex arcs’’ or
‘‘instincts’’ or ‘‘ideas’’ in the embryo. And truly ‘‘John Locke’s doc-
trine of tabula rasa rests on solid embryological as well as psychological
grounds.’’ The spreading of nervous impulses so conspicuous in embryo 
remains to a great extent throughout life. However, upon the stimulation 
of certain receptors there comes finally ‘‘canalization’’ which fact ac- 
counts for certain patterns of behavior resulting from the application of 
certain classes of stimuli. Thus Kappers’ law or Pavlov’s law of reflex
conditioning as applied to Bok’s principle of reflex-circle throws light 
upon ‘‘reciprocal innervation’’ and movements of progression. Thus 
from walking and other movements the author goes to ‘‘equilibration and
postural tonus.’’ In Chapter X he treats the education of the sensory
surfaces. Making use of ‘‘adient’’ and ‘‘avoidant’’ responses he builds
up all phases of learning. The chain reflex and cross conditioning are
two further principles upheld. The avoidance responses are the basis
of ‘‘trial and error’’ learning. Learning by ‘‘trial and error’’ is very
different from learning by reflex-circle although the physiological prin- 
ciple (neuro-biotaxis) is the same. Adient responses are acquired on the 
reflex-circle principle while ‘‘trial and error’’ learning is based on
avoidance response. 

‘‘Echo’’ and ‘‘imitation’’ the author explains on the reflex-circle prin-
ciple, and herein lies the basis of leadership. The imitative process is
more far-reaching however than just simple iteration and echo for it in-
volves Einfühlung or ‘‘empathy.’’ In fact all learning when obstacles
are encountered is of the avoiding kind. To avoid hunger the organism
explores until a consummatory reaction results. No goal-seeking for this
author. He accepts Washburn’s observation that in maze learning ‘‘it
is the movements nearest ‘success’ that are earliest learned.’’ Learning
of a serial habit can be explained in terms of Smith and Guthrie’s con-
ditioning, plus food-adience, plus backward order of elimination of errors.
Food-adience is in reality only hunger avoidance. 

In discussing integrative action of the nervous system the principle of 
‘‘re-enforcement’’ or ‘‘facilitation,’’ according to the author, presents
no great difficulties for it can be explained on the basis of ‘‘summation’’
of nerve impulses; but not so with ‘‘inhibition.’’ It presents great diffi-
culties. After analysing the many theories of inhibition the author then
works out from Wedensky’s phenomenon of inhibition and finally makes
out a case in favor of (although now not in too good repute with some
physiologists) inhibition by ‘‘overcrowding’’ of nerve impulses at the
synapse. 

In Chapter XX the author discusses ‘‘sustained responses,’’ ‘‘locus of
freedom,’’ ‘‘cross conditioning.’’ Four factors are involved here: (1)
external reference of all acquired responses; (2) immediate functional
inter-relations of these responses; (3) tendency toward getting more stimu-
lus in selection and unification (reflex-circle principle); (4) ‘‘cross-con-
ditioning.’’ The last named principle the author uses to explain ‘‘moods.’’
This cross conditioning explains the ‘‘self-determined’’ person who seems 
to be acting independently of present sense stimuli. However, there are 
internal stimuli acting. In his chapter on ‘‘Motor Block,’’ which when
it occurs also leads to ‘‘trial and error,’’ one is led to see that self-inter-
est is at the bottom of all behavior. He ends this Essay with a chapter
on the fallacies of the ‘‘Organism as a Whole’’ theories extant today.
The book ends with a supplementary philosophical essay by Harold
Chapman Brown on ‘‘This Material World.’’ The philosophical essay
is quite consistent with the scientific and conjectural tenets set forth 
by Holt in dealing with living organisms. 

The reviewer is not acquainted with all the physiological literature 
reported, yet one gets the impression that the source material quoted
has been examined and presented through honest penetrating eyes. One
is impressed with the painstaking scholarship and sophistication of the
author of this book. Only superior scholarship coupled with independence
of thinking could enable one to cull out facts and weave into a con-
sistent whole the physiological basis of all human behavior. Besides
its timeliness and extremely interesting positive point of view the book 
has a marvelous bibliography. Whether, ultimately, Holt is right or 
wrong one feels that in most cases his questions raised and answered are 
pointed indeed. 
JAMES R. PATRICK, 
Ohio University. 


ADELBERT FORD. Group Experiments in Elementary Psychology. The
Macmillan Co., New York, 1931. 241 pages, preface.

The author aims to provide collateral experimental work for the first 
course in psychology, organized around the usual class-room situation 
having only fair facilities for experimental work. ‘‘Most of the experi-
ments are adaptations of classical investigations in general psychology,
and only a few of the experiments may be definitely classed as ‘applied
psychology.’ ’’ The material is divided into four sections or chapters.
The first section on Mathematical Methods in Psychology has clear cut 
problems on Central Tendencies, Variability, Correlation and Probable 
Error. The second section deals with group demonstrations. Here are 
some of the factors that make this an unusual book. The Behavior of 
Unicellular Animals, the Nature of the Nerve Current, and the Matura- 
tion of Instincts are problems not found in most psychology laboratory 
manuals. There are also problems on Color-Blindness, Static Sense, The 
Reaction Time, in this group of 19 problems. The third section deals
with group experiments which are to be done by the students working in
pairs. The Nervous System, Weber’s Law, Trial and Error Learning,
Fluctuation of Attention, and Gestalten are some of the 15 experiments.
The fourth section is concerned with demonstrations in applied psychology.
The Intelligence Test, The Motor Aptitude Test, the Lie Detector, and 
Attention to Advertising Headlines are some of the nine experiments. 

The plan is uniform for all experiments, so that there should be a 
clear idea of how an experiment should be written up. The items in 
order are: the title; abundant references; the aim of the experiment; 
an historical background, touching in an interesting fashion upon the 
results of several similar experiments, followed by some statements 
of the general principles involved; materials needed; procedure described 
clearly and fully; and space to record the significance of the results. 
Blank sheets for data, ruled sheets for charts and curves, etc., are pro- 
vided wherever needed. The drawings showing the mechanical devices 
and how to set them up or to use them are good. 

The book provides abundant pertinent problems, allowing adaptation 
to individual classes, is clearly and interestingly written, and is on the 
whole the best student manual and note book for general psychology the 
reviewer has seen recently. 

J. R. GENTRY, 
Ohio University. 